ABSTRACT: We explain a method, inspired by control theory model
reduction and interpolation theory, that rigorously
establishes the types of coarse graining that are appropriate
for systems with quadratic, generalized Hamiltonians. For such
systems, general conditions are given that establish when local
coarse grainings should be valid. Interestingly, our analysis
provides a reduction method that is valid regardless of whether
or not the system is isotropic. We provide the linear harmonic
chain as a prototypical example. Additionally, these reduction
techniques are based on the dynamic response of the system,
and hence are also applicable to nonequilibrium systems.
1 Introduction



Despite many of the great successes of statistical mechanics, it still lacks 
adequate methods for systematically treating heterogeneous and nonequi- 
librium systems. This is especially disconcerting considering that much 
of the world about us is both heterogeneous and far from equilibrium. In 
addition, the treatment of open systems and systems with nontrivial bound-
ary conditions has yet to be systematically incorporated into statistical
mechanics 2 . It is not the purpose of this paper to address all of these defi-
ciencies. Rather, we introduce techniques from control theory engineering
and interpolation theory to shed new light on such problems. 

It is a standard practice in physics to simplify complicated systems. In 
particular limits, such as high-temperature or low-density, these idealiza- 
tions may become exact. We will discuss two of the main methods used in 
physics to construct reduced-order models. 

The projection-operator formalism (POF) of Mori and Zwanzig 0013 
112) is a method from nonequilibrium statistical mechanics. It allows contact 
between the constitutive conservative microscopic equations and the more 
macroscopic phenomenological Langevin equations. The key mathematical 
ingredient in this approach, given an arbitrary observable, is to project 
along particular "directions" in state space in order to obtain an alternative 
evolution equation involving contributions from a forcing term and from 
a memory kernel. Here the projections involved are simply integrations 
over the appropriate phase space variables. A textbook application of the 
POF is a particle in a heat bath [51 151 IT51 12"]. In this example, there is 
a clear split between important (system) variables and the less important 
(environment) variables. Thus, taking the system variables as the particle's 
position and momentum justifies projecting out the bath variables. 

The renormalization group (RG) from field theory and equilibrium sta-
tistical mechanics [33, 34], in its original form, involves identifying how
the physics of a system changes with scale. Equivalently, the renormal- 
ization group identifies how the parameters of a system's Hamiltonian or 
Lagrangian vary as the system is coarse grained. In the RG, systems are, 
almost invariably, locally coarse grained (i.e. locally-averaged). In the 
context of equilibrium statistical mechanics, the coarse graining is realized 
with the appropriate partial trace of a Boltzmann weight. Formally, the 
partial trace is equivalent to the projections used in the POF. 

An important observation is that both the POF and RG are completely 
general techniques. Although typical system reductions are either based on 
a priori system-environment splits or obvious symmetries dictating local 
coarse graining, there is enormous ambiguity in choosing which states to 

2 Of course the latter concern is somewhat atypical considering that boundary terms 
are usually deemed unimportant in the thermodynamic limit. 
trace out 3 . Intuition is enough of a guide for determining how to coarse
grain homogeneous systems with local interactions. However, without some 
direction for dealing with heterogeneous systems, possibly with nonlocal in- 
teractions, the POF and RG are too general; they are useless. For instance, 
locally averaging about the interface in a layered system loses important 
information about the system. Additionally, locally averaging such systems 
is actually more likely to complicate the model. Complications arise since 
the averaged theory would pick up extra couplings to enforce the constraint 
of well-defined boundaries and induce couplings between the bulk of the 
different layers. In short, for general systems, local coarse graining is likely 
to discard important details. Consequently, as the effective influence of 
these discarded details is reincorporated into the coarsened description of 
the system, the new effective theory becomes increasingly complicated. 

The above considerations support the view advocated in the work by 
Bricmont and Kupiainen (HTI llffil 13*5] , They contend that systems should 
not be blindly coarse grained scale by scale, but rather, large fluctuations 
should remain fixed while those degrees of freedom corresponding to small 
fluctuations are integrated away. A direct consequence of this perspective 
is that nonlocal coarse graining is on the same footing as its local counter- 
part. Intuitively, internal states that cause the largest fluctuations are the 
most relevant. The problem with such a program is that there exists no 
general framework that allows for an unambiguous measure of the relative 
importance of a system's internal degrees of freedom. 

It is our claim that methods from control theory and modern interpo- 
lation theory provide a complete, general framework for determining how 
to appropriately coarse grain linear and linearly-dominated nonlinear sys- 
tems. Consequently, this opens up many new possible avenues to address 
the full nonlinear problem [27, 28, 29]. The primary idea of this approach is
to coarse grain a system based on its dynamic response. For linear systems, 
it is possible to develop a completely unambiguous measure of how the in- 
ternal states of a system contribute to the response. In other words, it is 
possible to assign a relative importance to the internal degrees of freedom. 
Determining this measure then dictates how the system should be coarse 
grained. An especially nice feature of the control theory analysis is that 
it decomposes the response into two separate, physically intuitive, parts: 
the controllability and observability of the internal states. Furthermore, 
these techniques are not limited to the idealized setting in which all of the 
internal degrees of freedom of the system can be perfectly measured. In 
fact, these methods were tailored to deal with physical systems in an
experimental setting. They are applicable even if the actuators and sensors
interfaced with the system are imperfect. These methods are not only of
great theoretical use; they are of practical use as well.

3 Taking the partial trace of the probability density (Boltzmann weight) produces
the probability density for the corresponding random variable. Thus, the only real
constraint one should put on the partial trace is that it corresponds to a measurable
random variable.

In work by Hartle and Brun it is speculated that local coarse
graining produces more deterministic effective equations of motion than
nonlocal coarse graining. The problem with this claim is that it was made 
based on investigating the homogeneous linear harmonic chain on $\mathbb{Z}_N$ (i.e.
on a ring) and considering a set of measure zero of all possible ways to 
coarse grain the system. The main result of our paper rigorously establishes 
for what (linear) systems the above claim is true and how it breaks down 
for general linear systems. A primary instance when it breaks down is 
for heterogeneous systems. We also establish how to appropriately coarse 
grain systems when local coarse graining breaks down. 

This paper serves two functions: (1) to introduce and integrate basic
concepts from control theory into standard physics problems, and (2) to
develop and apply a new algorithm for coarse graining that complements
existing physical reduction techniques. In Section 2 we provide background
material on the open loop control of linear systems. The definitions of
controllability and observability are made precise. The controllability and
observability operators and gramians are then introduced. From these
objects, we establish a simultaneous measure of controllability and
observability. This measure specifies the relative importance of different internal
degrees of freedom. It also dictates how to model reduce or, equivalently,
to coarse grain. Appendix A contains important details that generalize
the control theory model reduction techniques in Section 2 to conservative
and unstable systems. The lower bound in Appendix A is a new result.
Lastly, in Section 3 we apply model reduction techniques to oscillator
systems to determine the "natural" reductions they admit. We see that under
some circumstances, depending on the spectral content of the system, local 
coarse graining is valid. We also show how to coarse grain a system even 
if it is not homogeneous and isotropic. Local coarse graining cannot be 
expected to be appropriate for general quadratic Hamiltonians. In fact, 
our analysis shows the precise manner in which it is not. For illustrative 
purposes, we examine the linear harmonic chain in detail. 

2 A control theory tutorial 

This section describes how control theory methods, in particular Hankel 
norm analysis, may be used to determine the relative importance of the 
internal degrees of freedom for arbitrary linear systems. A state's impor- 
tance is directly related to its contribution to the system's response. In this 
section, we introduce the requisite control theory terminology and notation 
that will be used throughout.

In the opening subsection, we introduce the linear systems under inves- 
tigation, their corresponding input-output behavior (response), and some 
requisite material on the realization theory of input-output operators. In 
the next subsection, we provide definitions and measures of controllabil- 
ity and observability. The Hankel operator, its interpretation in terms of 
controllability and observability, and its relation to balanced realizations 
comprise the final subsection. The latter topics are especially important in 
control theory model reduction and, consequently, also for coarse graining. 
Although we made no attempt for this tutorial to be an exhaustive review, 
we include enough detail for the paper to be self-contained. All theorems 
in this section are stated without proof. The interested reader is encour- 
aged to consult the following references [201 EU El El E3 • Those who are 
already familiar with the above concepts may comfortably skip ahead to 
Section 3.

2.1 Linear Systems and Realizations 

This paper concerns linear time invariant (LTI) systems (i.e. linear systems
with time translation invariance) of the form:

$$ \begin{aligned} \dot{x} &= Ax + Bu \\ y &= Cx + Du, \end{aligned} \tag{1} $$

where x is the "internal" state of the system, y represents quantities directly
measured by appropriately positioned sensors, and u represents the external
driving force. The matrix A captures the natural dynamics of the system,
while B and C dictate, respectively, which internal states of the system
are in contact with the driving and the measurement. D is responsible for
the feedthrough of the system. Feedthrough is the (possibly amplified)
contribution of the driving that is directly measured. A control theoretic
diagram representing this system is in Figure 1. Expressing the system in
this form reflects that only partial information is measured and that the
system is open. Since the system is LTI, A, B, C, and D are constant
coefficient matrices.

The general solution to the above problem is given by:

$$ y(t) = \underbrace{C e^{A(t - t_0)} x(t_0)}_{\text{zero input response}} + \underbrace{\int_{t_0}^{t} C e^{A(t-\tau)} B u(\tau)\,d\tau + D u(t)}_{\text{zero state response}}, \tag{2} $$

where $t_0$ denotes the initial time. If we consider only the zero state
response (trivial initial conditions), then, omitting the feedthrough term,

$$ y = Gu = \int_{t_0}^{t} \mathcal{K}(t,\tau) u(\tau)\,d\tau = \int_{t_0}^{t} C e^{A(t-\tau)} B u(\tau)\,d\tau. \tag{3} $$

The integral kernel has many names. In the time domain it is referred to 
as the impulse response or the Green's function. Alternatively, for stable, 
LTI systems, the Fourier transform of the integral kernel is also known 
as the transfer matrix, Green's function in the frequency domain, or the 
propagator. Since feedthrough is not crucial in our analysis, we set D = 0 from
this point forward.

Figure 1: A block diagram representation of the linear system from (1) and (2).
Directed lines flowing into boxes represent vectors being multiplied by operators
(or matrices). For instance, initially u flows into B, hence the output of the first
box is Bu. The circles in the diagram are adders. Vectors that flow into adders
are summed.
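
As a hedged numerical illustration of the zero state response in (3), the following Python sketch compares the kernel formula with a direct simulation of the system. The damped oscillator matrices, the unit step input, and the choice of 0 as the initial time are assumptions made for illustration; none of them come from the text. For a constant unit input the integral in (3) evaluates in closed form to $C A^{-1}(e^{At} - I)B$, which requires A to be invertible.

```python
import numpy as np
from scipy.linalg import expm
from scipy.signal import lsim

# Hypothetical example system: a damped oscillator driven through its momentum
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.zeros((1, 1))  # no feedthrough, as assumed in the text

t = np.linspace(0.0, 10.0, 201)
u = np.ones_like(t)  # unit step input (an illustrative choice)

# Direct simulation of the state equations from trivial initial conditions
_, y_sim, _ = lsim((A, B, C, D), u, t)

# Kernel formula with u = 1:
# int_0^t C e^{A(t - tau)} B dtau = C A^{-1} (e^{At} - I) B
y_kernel = np.array([
    (C @ np.linalg.solve(A, (expm(A * tk) - np.eye(2)) @ B)).item()
    for tk in t
])

max_diff = np.max(np.abs(y_sim - y_kernel))
print("largest discrepancy between simulation and kernel formula:", max_diff)
```

The two computations should agree to numerical precision, since both evaluate the same convolution of the input with the impulse response.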

In control theory, a system is specified by an experiment. This is re- 
flected by the dependence of G on B,C, and D. A system is defined by its 
response (i.e. by G). From an experiment, the only available data is from 
the inputs and outputs. Hence, the matrices (A, B,C,D) are unknown. 
Constructing all of such matrices corresponding to a given response is the 
objective of realization theory. The system matrices (A, B,C,D) form 
a state space realization of the system. For a given system, there does 
not exist a unique realization. However, given a realization, there exists 
a unique system. The choice of experiment fixes the system's inputs and 
outputs. The remaining ambiguity is due to the internal states of the 
system. For instance, consider an arbitrary invertible, linear change of variables,
x = Rz. Then $(R^{-1}AR, R^{-1}B, CR, D)$ is also a realization
of G. Since G is invariant under the above similarity transformations,
realizations belong to equivalence classes. The remaining indeterminateness
arises since there is not a bound on the number of internal degrees
of freedom. In fact, there may be arbitrarily many internal states that
do not contribute to the system's response. State space realizations that
have the minimal internal state dimension are called minimal realizations 4 .
Although G is typically an infinite rank operator (i.e. the image of G
is infinite dimensional), the internal state dimension of its minimal realiza- 
tions gives it an order. When the minimal realization is stable, the order 
of G is also known as its McMillan degree. 
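
The invariance of the response under a change of internal coordinates can be checked numerically. The sketch below is a minimal illustration, assuming randomly generated matrices (the dimensions, the seed, and the near-identity form of R are all hypothetical choices): it builds the transformed realization $(R^{-1}AR, R^{-1}B, CR)$ and compares the impulse response kernels $C e^{At} B$ of the two realizations.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n, m, p = 4, 2, 3  # illustrative state, input, and output dimensions

A = rng.standard_normal((n, n)) - 3.0 * np.eye(n)
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

# A small perturbation of the identity, so R is safely invertible
R = np.eye(n) + 0.1 * rng.standard_normal((n, n))
R_inv = np.linalg.inv(R)

def impulse_response(A, B, C, t):
    """Kernel C exp(A t) B of the zero state response with D = 0."""
    return C @ expm(A * t) @ B

# Realization expressed in the new internal coordinates z, where x = R z
A_z, B_z, C_z = R_inv @ A @ R, R_inv @ B, C @ R

times = np.linspace(0.0, 2.0, 9)
max_diff = max(
    np.abs(impulse_response(A, B, C, t) - impulse_response(A_z, B_z, C_z, t)).max()
    for t in times
)
print("largest difference between the two impulse responses:", max_diff)
```

Both realizations produce the same kernel, which is why the response alone cannot single out a preferred set of internal coordinates.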

Minimal realizations represent the part of the system that is observ- 
able and controllable. To clarify this, we will introduce the definitions of 
controllability and observability. We then define the controllability and 
observability operators along with their respective gramians. These
concepts are imperative to assigning a measure of how much an individual
internal state contributes to the system's response. 

2.2 Controllability and Observability 

• Controllability 

Controllability concerns the effects of driving on the system. In particu- 
lar, a system is controllable if it is possible to drive a system from any initial 
configuration to any final configuration. An internal state of a system is 
considered uncontrollable if it cannot be driven to every other state. 

The issue of whether or not a system (or state) is controllable is a 
yes-or-no question. However, we may still intuitively assign a degree of 
controllability to a state. An example of this is to consider the response of a 
conservative system when it is driven at one of its characteristic frequencies 
(at resonance). This is mathematically realized by a divergence (or a peak,
in general) in the Fourier transform of the response. This is the simplest
example of a system's mode being very controllable, insofar as we can
elicit a large response from small amplitude driving 5 . It is easy to drive
states in the direction of this mode. In general, input-output resonances
do not always correspond to internal resonances. Should there be a set
or subspace of state or phase space that cannot be reached via driving,
then such "directions" are uncontrollable. A system is controllable if every 
direction in state or phase space is controllable. The following provides a 
more formal and precise definition of controllability. 

Definition 2.2.1 A system is controllable if it is possible to drive any
initial state $x_0$ to any final state $x_f$ in any nonzero time interval.

• Observability 

Observability describes how easily the internal state of the system can 
be reconstructed from measurements of the output. Intimately connected 

4 Minimal realizations of a given state dimension all belong to the same equivalence 
class. 

5 Driving with small amplitude is also termed driving or forcing with small gain. 
to this is the precise determination of the internal initial conditions.
Initial conditions that cannot be reconstructed are the system's unobservable
states. 

The mechanical model of a particle in a heat bath provides a physical 
example of observable versus unobservable states. As alluded to earlier, 
such a system admits a natural system-environment split. In this case the
system is the single oscillator, while the environment is the bath. The
oscillator is the primary object under investigation and hence, an experimental
apparatus is devised to measure its displacement and/or velocity. Since the
bath is composed of innumerable constituent particles (or oscillators), the
individual trajectories of the bath particles are unknown. While the single 
oscillator is strongly observable, the bath is only weakly observable. It is 
possible to reconstruct the initial conditions for the single oscillator but not 
for the entire bath. A more precise and formal definition of observability 
is: 

Definition 2.2.2 A system is observable if it is possible to fully determine
any initial state $x_0$ by measuring y over any nonzero time interval.

• Controllability and observability operators 

It is clear that the input-output operator, G, from equation (3) takes
inputs u from some space and produces outputs y in another. For concreteness,
from now on we will consider the domain of G to be $\mathcal{L}_2^m[-T,T]$ (i.e. m
copies of $\mathcal{L}_2$) and the range to be $\mathcal{L}_2^p[-T,T]$. More generally, u may also
be a vector in C™, £™, or a Langevin contribution to the dynamics. The 
construction of the controllability and observability operators, $\Psi_c$ and $\Psi_o$
respectively, is largely motivated by the fact that the Hankel operator, to 
be introduced later, can be factored into their product. Thus, the response 
may be decomposed into observability and controllability. 

The controllability operator is defined by: 

$$ \Psi_c : \mathcal{L}_2^m[-T,0] \to \mathbb{R}^n, \qquad \Psi_c u = \int_{-T}^{0} e^{-A\tau} B u(\tau)\,d\tau = \int_{0}^{T} e^{A\tau} B u(-\tau)\,d\tau. \tag{4} $$

Formally $\Psi_c$ is not defined on the full domain of G. It can be extended to
the full space, however, by defining $\mathcal{L}_2^m(0,T]$ to be in its null space. The
controllability operator allows for an algebraic definition of controllability.

Theorem 2.2.3 A linear system, as in equation (1), specified by (A, B, C, D)
is controllable if and only if the image of $\Psi_c$ is all of $\mathbb{R}^n$.

If a system in equation (1) is controllable, we call the system pair (A, B)
controllable. Additionally, the space of states that are controllable forms
an A-invariant subspace. The controllable subspace is precisely the image
of $\Psi_c$.

Similarly, the observability operator is defined by: 

$$ \Psi_o : \mathbb{R}^n \to \mathcal{L}_2^p[0,T], \qquad (\Psi_o x_0)(t) = C e^{At} x_0. \tag{5} $$

In contrast to controllability, the set of observable states does not form an
invariant subspace. Instead, the span of the unobservable states forms an A-invariant
subspace 6 . The null space of $\Psi_o$, $\mathrm{Null}(\Psi_o)$, comprises the unobservable
subspace. This implies that a system is observable provided that $\Psi_o$ has
full rank (i.e. the null space is trivial). This gives the formal algebraic
definition of observability.

Theorem 2.2.4 (Test for observability) A linear system given by (1)
is observable iff $\mathrm{rank}(\Psi_o) = n$ (i.e. $\Psi_o$ has full rank).

If a system is observable, we call the system pair (C, A) observable.
Controllability and observability are completely dual to each other. For
example, (A, B) is controllable if and only if $(B^\dagger, A^\dagger)$ is observable.
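
For finite dimensional systems, the yes-or-no questions above can be answered with the Kalman rank test, a standard criterion that is equivalent to the operator conditions but is not the test stated in the text. The sketch below is a minimal illustration under that substitution; the helper names and the two example systems are hypothetical. Observability is checked through the duality just described.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack the blocks [B, AB, ..., A^(n-1) B] side by side."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank test: (A, B) is controllable iff the matrix has rank n."""
    return bool(np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0])

def is_observable(C, A):
    """Duality: (C, A) is observable iff (A^dagger, C^dagger) is controllable."""
    return is_controllable(A.conj().T, C.conj().T)

# Hypothetical example 1: a harmonic oscillator forced through its momentum,
# with only the position measured
A_osc = np.array([[0.0, 1.0], [-1.0, 0.0]])
B_osc = np.array([[0.0], [1.0]])
C_osc = np.array([[1.0, 0.0]])

# Hypothetical example 2: two decoupled modes with driving applied to only one
A_dec = np.diag([-1.0, -2.0])
B_dec = np.array([[1.0], [0.0]])

print(is_controllable(A_osc, B_osc), is_observable(C_osc, A_osc))
print(is_controllable(A_dec, B_dec))
```

The decoupled example fails the test because the second mode never feels the driving, which is the uncontrollable "direction" described above.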

• Controllability and observability gramians 

Superficially it may seem that the above operators only give us limited 
information. Specifically, we only have binary tests for controllability and 
observability based on whether or not the state is in $\mathcal{R}(\Psi_c)$ or in $\mathrm{Null}(\Psi_o)$.
Our objective is to determine how observable and controllable a state is 
in order to quantify its contribution to the response. It is precisely the 
controllability and observability gramians that provide this information. 
However, as will soon become apparent, the operators are intimately related 
to the gramians. 

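Determining $\mathcal{R}(\Psi_c)$ and $\mathrm{Null}(\Psi_o)$ is a formidable challenge since the
domain of $\Psi_c$ and the range of $\Psi_o$ are infinite dimensional spaces.
However, the formal operator adjoint makes the problem more tractable. Since
$\Psi_c : \mathcal{L}_2^m[-T,0] \to \mathbb{R}^n$, this implies that $\Psi_c^\dagger : \mathbb{R}^n \to \mathcal{L}_2^m[-T,0]$, where
$\Psi_c^\dagger$ is the operator adjoint of $\Psi_c$. Similarly, $\Psi_o^\dagger : \mathcal{L}_2^p[0,T] \to \mathbb{R}^n$. An
important property of the adjoint of an operator is that its image is
perpendicular to the original operator's null space, that is $\mathcal{R}(T^\dagger) \perp \mathrm{Null}(T)$.
Also $\mathcal{R}(\Psi_c^\dagger) \perp \mathrm{Null}(\Psi_c)$ and $\mathrm{Null}(\Psi_o) \perp \mathcal{R}(\Psi_o^\dagger)$. Consequently, $\mathcal{R}(\Psi_c) =
\mathcal{R}(\Psi_c \Psi_c^\dagger)$ and $\mathrm{Null}(\Psi_o) = \mathrm{Null}(\Psi_o^\dagger \Psi_o)$. Moreover, since $\Psi_c^\dagger$ maps $\mathbb{R}^n$ to
$\mathcal{L}_2^m[-T,0]$ and $\Psi_o^\dagger$ maps $\mathcal{L}_2^p[0,T]$ to $\mathbb{R}^n$, $\Psi_c \Psi_c^\dagger$ and $\Psi_o^\dagger \Psi_o$ are n × n
matrices. Finding the controllability and observability subspaces reduces to
discovering the images and null spaces of n × n matrices.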

6 This and its controllable analog are important because they are responsible for the 
Kalman decomposition. 
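From the definition of the adjoint, the expression for $\Psi_c^\dagger$ is

$$ \Psi_c^\dagger : \mathbb{R}^n \to \mathcal{L}_2^m[-T,0], \qquad (\Psi_c^\dagger z)(t) = B^\dagger e^{-A^\dagger t} z \quad \text{for all } z \in \mathbb{R}^n,\ t \in [-T,0]. \tag{6} $$

The following relation holds for $\Psi_o^\dagger$:

$$ \Psi_o^\dagger : \mathcal{L}_2^p[0,T] \to \mathbb{R}^n, \qquad \Psi_o^\dagger f = \int_0^T e^{A^\dagger \tau} C^\dagger f(\tau)\,d\tau. \tag{7} $$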



From above, we are left with objects that are of fundamental importance 
for establishing quantitative measures of controllability and observability, 
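the gramians. The controllability gramian, $W_c$, is defined by:

$$ W_c(T) = \Psi_c \Psi_c^\dagger = \int_0^T e^{A\tau} B B^\dagger e^{A^\dagger \tau}\,d\tau. \tag{8} $$

The observability gramian, $W_o$, is defined by:

$$ W_o(T) = \Psi_o^\dagger \Psi_o = \int_0^T e^{A^\dagger \tau} C^\dagger C e^{A\tau}\,d\tau. \tag{9} $$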



From their definitions, the gramians are both self-adjoint and positive
semidefinite. Additionally, the controllable subspace of the system is the image
of $W_c(T)$. Thus, a linear system is controllable if and only if $W_c(T)$ is
nonsingular (invertible). Similarly, the unobservable subspace is the null
space of $W_o(T)$. A linear system is then observable if and only if $W_o(T)$
is nonsingular (i.e. the null space is trivial). Consequently, controllability
and observability are determined by calculating two matrices, $W_c(T)$ and
$W_o(T)$. Equations (8) and (9) are computationally not very useful. It is
typically easier to determine the gramians from the differential equations that they
satisfy.

$$ \frac{dW_c}{dT} = A W_c + W_c A^\dagger + B B^\dagger; \qquad W_c(0) = 0 \tag{10} $$

$$ \frac{dW_o}{dT} = A^\dagger W_o + W_o A + C^\dagger C; \qquad W_o(0) = 0 \tag{11} $$



For stable systems, in the limit as $T \to \infty$, the gramians satisfy algebraic
Lyapunov equations 7 .

7 The Lyapunov equations are $A W_c + W_c A^\dagger + B B^\dagger = 0$ and $A^\dagger W_o + W_o A + C^\dagger C = 0$.
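
The two routes to the gramians can be compared numerically. The following sketch is an illustration under stated assumptions: the damped oscillator, the horizon T = 40, and the helper name `finite_horizon_gramian` are hypothetical, and the matrices are taken to be real so that the adjoint is the transpose. It integrates the differential equations (10) and (11) from zero and compares the result at large T with the solutions of the algebraic Lyapunov equations obtained from SciPy.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical stable example: a damped oscillator with position measured
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def finite_horizon_gramian(F, Q, T):
    """Integrate dW/dT = F W + W F^T + Q from W(0) = 0 up to time T (real F)."""
    n = F.shape[0]

    def rhs(_, w):
        W = w.reshape(n, n)
        return (F @ W + W @ F.T + Q).ravel()

    sol = solve_ivp(rhs, (0.0, T), np.zeros(n * n), rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].reshape(n, n)

T = 40.0
Wc_T = finite_horizon_gramian(A, B @ B.T, T)    # equation (10)
Wo_T = finite_horizon_gramian(A.T, C.T @ C, T)  # equation (11)

# Algebraic Lyapunov equations: A Wc + Wc A^T = -B B^T and A^T Wo + Wo A = -C^T C
Wc_inf = solve_continuous_lyapunov(A, -B @ B.T)
Wo_inf = solve_continuous_lyapunov(A.T, -C.T @ C)

print("controllability gramian:\n", Wc_inf)
print("observability gramian:\n", Wo_inf)
print("both nonsingular:",
      np.linalg.matrix_rank(Wc_inf) == 2 and np.linalg.matrix_rank(Wo_inf) == 2)
```

For this particular oscillator the infinite horizon controllability gramian is the identity, which makes the example easy to check by hand against the Lyapunov equation.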
Figure 2: (a) depicts a controllability ellipsoid, while (b) depicts an
observability ellipsoid. The semimajor axis of the ellipsoid in (a) indicates the most
controllable direction in state space, and for the ellipsoid in (b) it indicates the 
most observable direction. 

The directions in state or phase space corresponding to trivial
eigenvalues of $W_c$ are uncontrollable. Therefore, the eigenvectors of $W_c$
corresponding to small eigenvalues are only weakly controllable. It is along
those directions that the controllability gramian is almost singular.
Physically, it requires much higher gains to reach these states than the more
controllable states. More precisely, consider the quadratic energy
functional $F(u) = \int_{-T}^{0} \|u(t)\|^2_{\mathbb{R}^m}\,dt = \|u\|^2_{\mathcal{L}_2^m[-T,0]}$ that measures the energy due
to driving. The u that expends the least energy to reach a state $x \in \mathbb{R}^n$
from the origin 8 is given by $u_{\min} = \Psi_c^\dagger W_c^{-1} x$. The energy due to such
driving is

$$ \begin{aligned} \|u_{\min}\|^2_{\mathcal{L}_2^m[-T,0]} &= \langle \Psi_c^\dagger W_c^{-1} x, \Psi_c^\dagger W_c^{-1} x \rangle_{\mathcal{L}_2^m} \\ &= \langle x, W_c^{-1} \Psi_c \Psi_c^\dagger W_c^{-1} x \rangle_{\mathbb{R}^n} = \langle x, W_c^{-1} x \rangle_{\mathbb{R}^n} \\ &= \|W_c^{-1/2} x\|^2_{\mathbb{R}^n}. \end{aligned} \tag{12} $$

If we drive the system in state or phase space with minimal force $\|u_{\min}\|^2 \le
1$, the corresponding region in state or phase space is a solid ellipsoid in $\mathbb{R}^n$.
This set, depicted in Figure 2(a), corresponds to $\{x \in \mathbb{R}^n : x^\dagger W_c^{-1} x \le 1\}$.
This ellipsoid is also specified by $\{x \in \mathbb{R}^n : x = W_c^{1/2} z, \|z\|_{\mathbb{R}^n} \le 1\}$. While
$\|W_c^{-1/2} x\|^2$ measures energy expenditure, $\|W_c^{1/2} x\|^2 / \|x\|^2 = x^\dagger W_c x / x^\dagger x$ measures a
state's controllability. This confirms the intuition that states corresponding
to small eigenvalues of $W_c$ are the least controllable.
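
The minimum energy result can be verified numerically. The sketch below is a minimal illustration, assuming a damped oscillator, a horizon T = 5, a target state, and a trapezoidal quadrature grid, all of which are hypothetical choices. It builds $W_c(T)$ from (8), forms $u_{\min} = \Psi_c^\dagger W_c^{-1} x$ using (6), and checks both that the driving reaches the target and that its energy equals $x^\dagger W_c^{-1} x$.

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.linalg import expm

# Hypothetical example: a damped oscillator forced through its momentum
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
T = 5.0
taus = np.linspace(0.0, T, 2001)  # quadrature grid for tau in [0, T]

# e^{A tau} B on the grid, shape (N, n, m)
kernels = np.array([expm(A * tau) @ B for tau in taus])

# W_c(T) = int_0^T e^{A tau} B B^T e^{A^T tau} dtau, equation (8), by quadrature
Wc = trapezoid(kernels @ kernels.transpose(0, 2, 1), taus, axis=0)

x_target = np.array([1.0, 0.0])
z = np.linalg.solve(Wc, x_target)  # W_c^{-1} x

# u_min(t) = B^T e^{-A^T t} W_c^{-1} x on t in [-T, 0]; stored as a function of tau = -t
u_min = kernels.transpose(0, 2, 1) @ z  # shape (N, m)

# Energy of the driving, computed directly and from the closed form x^T W_c^{-1} x
energy_quadrature = trapezoid(np.sum(u_min**2, axis=1), taus)
energy_formula = x_target @ z

# State reached at t = 0: Psi_c u = int_0^T e^{A tau} B u(-tau) dtau, equation (4)
x_reached = trapezoid(np.einsum("kij,kj->ki", kernels, u_min), taus, axis=0)

print("energy (quadrature, formula):", energy_quadrature, energy_formula)
print("state reached:", x_reached)
```

Because the same quadrature rule is used throughout, the discrete identities hold to rounding error, independent of how finely the grid resolves the integrals.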

Physically, the observability operator produces a response given an
initial condition. What does this reveal about the inverse problem of
reconstructing the initial conditions from measurements of the system's
response? Mathematically this problem is posed as determining

$$ \min_{x \in \mathbb{R}^n} \|y - \Psi_o x\|^2_{\mathcal{L}_2^p[0,T]}. $$

When y is in the image of $\Psi_o$, it is possible to precisely specify the initial
conditions, x. Otherwise, the initial condition minimizing $\|y - \Psi_o x\|^2_{\mathcal{L}_2^p[0,T]}$,
given an arbitrary $y \in \mathcal{L}_2^p[0,T]$, is $x_{\mathrm{opt}} = W_o^{-1} \Psi_o^\dagger y$. In order to
obtain a quantitative measure of observability, we need only consider the
outputs, $y = \Psi_o x$. Immediately we recognize that $\|y\|_{\mathcal{L}_2^p[0,T]} / \|x\|_{\mathbb{R}^n} =
\|W_o^{1/2} x\|_{\mathbb{R}^n} / \|x\|_{\mathbb{R}^n}$ measures a state's observability. For instance, initial
conditions, x, corresponding to small eigenvalues of $W_o$ elicit smaller
responses than other states and, consequently, are less observable. If noise
is present, responses resulting from such initial conditions may not be
observed at all. The set $\{x \in \mathbb{R}^n : x = W_o^{1/2} z, \|z\|_{\mathbb{R}^n} \le 1\}$ corresponds
to the observability ellipsoid depicted in Figure 2(b). The directions along
which the ellipsoid is long are the most observable.

8 This is provided that the system is controllable and the dynamics of the system are
restricted to satisfy (1) with D = 0.

The utility in considering controllability and observability separately is 
that they have precise and experimentally relevant interpretations. A prob- 
lem with this approach is that it initially obstructs the path to ascribing 
measures of response to physical states. For instance, it is possible to model 
reduce based on either controllability or observability 9 . Unfortunately,
the measures of controllability and observability are not unique. This is
transparent after considering how $W_o$ and $W_c$ transform under similarity
transformations of the system. Under x = Rz, $W_o$ transforms as $W_o \to R^\dagger W_o R$,
while $W_c$ transforms as $W_c \to R^{-1} W_c (R^\dagger)^{-1}$. Thus, the gramians
are not invariant under an arbitrary linear coordinate transformation.

However, $W_c W_o$ transforms as $W_c W_o \to R^{-1} W_c W_o R$ under
a similarity transformation of the system. The eigenvalues of $W_c W_o$
are therefore invariants of the system. They are intimately related to the Hankel operator
of the system, and thus will prove invaluable for producing reduced-order
models.
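
These transformation rules are straightforward to check numerically. The sketch below is a hedged illustration with randomly generated real matrices; the dimensions, the seed, the construction that guarantees a stable A, and the helper name `gramians` are all hypothetical. It uses the infinite horizon gramians from the Lyapunov equations, confirms the individual transformation laws, and compares the eigenvalues of $W_c W_o$ before and after the change of coordinates.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def gramians(A, B, C):
    """Infinite horizon gramians of a stable real system (A, B, C)."""
    Wc = solve_continuous_lyapunov(A, -B @ B.T)
    Wo = solve_continuous_lyapunov(A.T, -C.T @ C)
    return Wc, Wo

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
# The symmetric part of A is negative definite, which guarantees stability
A = (M - M.T) - M @ M.T - np.eye(n)
B = rng.standard_normal((n, 2))
C = rng.standard_normal((2, n))

# Change of internal coordinates x = R z, with R close to the identity
R = np.eye(n) + 0.1 * rng.standard_normal((n, n))
R_inv = np.linalg.inv(R)

Wc, Wo = gramians(A, B, C)
Wc_z, Wo_z = gramians(R_inv @ A @ R, R_inv @ B, C @ R)

# The gramians themselves change with the coordinates ...
gramians_changed = not np.allclose(Wc, Wc_z)

# ... but the eigenvalues of the product do not
eig = np.sort(np.linalg.eigvals(Wc @ Wo).real)[::-1]
eig_z = np.sort(np.linalg.eigvals(Wc_z @ Wo_z).real)[::-1]
print("eigenvalues of Wc Wo (original): ", eig)
print("eigenvalues of Wc Wo (transformed):", eig_z)
```

The invariant eigenvalues are what allow an unambiguous ranking of internal directions, whereas a ranking based on either gramian alone depends on the coordinates chosen.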

2.3 The Hankel Operator, Balanced Realizations, and 
Model Reduction 

The theory of model reduction is closely related to that of system realiza- 
tions. In model reduction, the goal is to find realizations (i.e. the matrices 

9 It has been shown in [1] that reduction based on the proper orthogonal
decomposition (POD) is essentially equivalent to model reducing based on controllability.

(A, B, C, D)) with minimal state dimension that approximately capture
the system's input-output characteristics. Often being "near" the original
system is enough to dramatically reduce the number of internal states
needed to model the system. Here distance or "nearness" is defined by the
standard induced-operator norm. For example, given an operator S that
acts on $\mathcal{L}_2$, the induced norm takes the form $\|S\|_{\mathcal{L}_2,i} = \sup_{\|v\|_{\mathcal{L}_2} \le 1} \|Sv\|_{\mathcal{L}_2}$
where v is a vector in $\mathcal{L}_2$. Also, supposing that $\tilde{S}$ approximates S, we will
be considering two measures of the error, the absolute error $\|S - \tilde{S}\|_{\mathcal{L}_2,i}$ and
the relative error $\|S - \tilde{S}\|_{\mathcal{L}_2,i} / \|S\|_{\mathcal{L}_2,i}$. The relative error is more
appropriate for unbounded operators, since most approximations are asymptotic
estimates. The relative error is also the noise-to-signal ratio.

It is useful to note that there is an explicit expression for the induced
$\mathcal{L}_2$ norm for bounded (stable), LTI, causal operators. Such operators are in
the space $\mathcal{H}_\infty$. The primary difficulty with this formula is that it is difficult
to use both numerically and analytically. To motivate the formula, recall
that an arbitrary m × n matrix M can be decomposed as $M = U \Sigma V^*$ (i.e.
the singular value decomposition) where U and V are respectively m × m
and n × n unitary matrices and $\Sigma$ is only nontrivial along the diagonal.
The diagonal values $\Sigma_{jj} = \sigma_j(M) \ge 0$ are called the singular values of
M. If m > n then the singular values are the eigenvalues of $\sqrt{M^\dagger M}$,
otherwise they are the eigenvalues of $\sqrt{M M^\dagger}$. Here $\|M\|_{\mathbb{C}^n,i} = \sigma_{\max}(M)$
where $\sigma_{\max}(M) = \max_j \sigma_j(M)$ (i.e. the largest singular value). For a
bounded, LTI, causal operator S such that $S(u) = \int_{-\infty}^{t} \mathcal{K}_S(t - \tau) u(\tau)\,d\tau$,
$\|S\|_{\mathcal{L}_2,i} = \sup_{\omega \in \mathbb{R}} \sigma_{\max}(\hat{\mathcal{K}}_S(\omega))$ where $\hat{\mathcal{K}}_S(\omega)$ is the Fourier transform of
$\mathcal{K}_S(t)$.
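
The supremum formula can be approximated by sampling frequencies. The sketch below is a rough illustration rather than a robust algorithm: a finite grid only gives a lower estimate of the supremum. It assumes a stable state space system with D = 0, for which the Fourier transform of the kernel $C e^{At} B$ is $C(i\omega I - A)^{-1}B$, and uses a hypothetical damped oscillator with damping 0.5 whose peak gain is known in closed form. The function names and the frequency grid are illustrative choices.

```python
import numpy as np

def transfer_matrix(A, B, C, omega):
    """Fourier transform of the kernel C exp(A t) B for stable A: C (i w I - A)^{-1} B."""
    n = A.shape[0]
    return C @ np.linalg.solve(1j * omega * np.eye(n) - A, B)

def induced_norm_estimate(A, B, C, omegas):
    """Largest singular value of the transfer matrix over a frequency grid."""
    return max(
        np.linalg.svd(transfer_matrix(A, B, C, w), compute_uv=False)[0]
        for w in omegas
    )

# Hypothetical example: damped oscillator with unit frequency and damping 0.5
gamma = 0.5
A = np.array([[0.0, 1.0], [-1.0, -gamma]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

omegas = np.linspace(0.0, 5.0, 5001)
norm_estimate = induced_norm_estimate(A, B, C, omegas)

# Closed-form resonance peak of 1 / (1 - w^2 + i gamma w) for this oscillator
expected = 1.0 / (gamma * np.sqrt(1.0 - gamma**2 / 4.0))
print("grid estimate:", norm_estimate, "closed form:", expected)
```

For systems with sharp resonances the grid must be refined near the peaks, which is one practical face of the numerical difficulty mentioned above.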

Coarse graining and model reduction are intimately related. While 
both are reduction methods, coarse graining emphasizes the spatial nature 
of reductions. Model reduction, as it will be presented here, emphasizes 
input-output resonances and approximating the response. Our approach 
to coarse graining is to identify the best way to model reduce, based on 
the response, and then ascertain the spatial structure of the reduction. 
The latter topic is elaborated upon for linear oscillator systems in the next 
section. Unfortunately, the former issue is a mathematically unresolved 
problem. The control/interpolation theoretic statement of this problem for 
stable systems (in the induced $\mathcal{L}_2$-norm) is called the $\mathcal{H}_\infty$ model reduction
problem. 

Definition 2.3.1 ($\mathcal{H}_\infty$ Model Reduction Problem) Given a bounded,
LTI, causal operator G of McMillan degree n, such that $G : \mathcal{L}_2 \to \mathcal{L}_2$, find
$\min_{\deg(\tilde{G}) \le k} \|G - \tilde{G}\|_{\mathcal{L}_2,i}$ for $\tilde{G}$ a bounded, LTI, causal operator and k < n.

Fortunately, enough is known about a related problem, the Hankel norm
model reduction problem, to provide error bounds to the above $\mathcal{H}_\infty$
problem. By using results from Hankel operator analysis and Hankel norm
model reduction, we will be able to deduce physically-important results
about coarse graining.

• The Hankel operator 

The input-output picture corresponds to an experimental situation, al- 
beit a complicated one. The full input-output operator, G, from y = Gu 
represents continuously driving and measuring a system. This operator is 
difficult to study because it does not separate observation from driving. As 
will become apparent, the Hankel operator is the part of the input-output 
operator where the operations of measurement and forcing are separated. 

To facilitate analysis, it is convenient to decompose $\mathcal{L}_2[-T,T]$ into
$\mathcal{L}_2[-T,0] \oplus \mathcal{L}_2(0,T]$, in other words, a causal (analytic) decomposition.
LTI causal systems can be visualized in the following way:

\[ G = \begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix} = \begin{pmatrix} \tilde{T}_G & 0 \\ \Gamma_G & T_G \end{pmatrix}. \qquad (13) \]



Causality implies that $G_{12} = 0$. The additional constraint that the system
is LTI implies that $K(t,\tau)$ is purely a function of $t - \tau$. $T_G$ and $\tilde{T}_G$ are
Toeplitz operators, while $\Gamma_G$ is a Hankel operator. The reader need not
be familiar with Hankel or Toeplitz operators.
The interested reader is encouraged to consult [T7| fTHl ITHl I^Tl 1^1 1^1 fTKl IT??] 
to learn more about these operators. If we denote the projection operator
onto $\mathcal{L}_2[0,T]$ by $P_+$, then $P_+^2 = P_+$ and $\Gamma_G = P_+ G|_{\mathcal{L}_2[-T,0]}$. Since such
projection operators can never increase the norm, it follows that $\|\Gamma_G\| \le \|G\|$.
Similarly, $\|\tilde{T}_G\| = \|T_G\| \le \|G\|$, where the first equality arises since
$T_G$ and $\tilde{T}_G$ differ only by time reversal. A somewhat surprising fact is that
$\|T_G\| = \|G\|$ [7]. Model reduction based on $T_G$ is equivalent to the full $H_\infty$
problem for stable systems. Unfortunately, the experiments represented by
$T_G$ involve simultaneously driving and observing.

The Hankel operator, $\Gamma_G$, accepts inputs driving the internal state to
some $x_0$ at time $t = 0$. Subsequently, the system is measured as it evolves
in time. The separation of driving and measurement allows $\Gamma_G$ to be
factored in a particularly convenient way:

\[ \Gamma_G(u) = P_+ G|_{\mathcal{L}_2[-T,0]}\,u = P_+ \int_{-T}^{0} C e^{A(t-\tau)} B u(\tau)\,d\tau = P_+ C e^{At} \int_{-T}^{0} e^{-A\tau} B u(\tau)\,d\tau = \Psi_o \Psi_c u. \qquad (14) \]



The Hankel operator may be factored as a product of the observability
operator and controllability operator: $\Gamma_G = \Psi_o \Psi_c$. It follows that

\[ \|\Gamma_G\|_{\mathcal{L}_2,i} = \|\Gamma_G^\dagger \Gamma_G\|^{1/2}_{\mathcal{L}_2,i} = \|W_o W_c\|^{1/2} = \|\sqrt{W_o W_c}\|_{\mathbb{C}^n}. \qquad (15) \]

In fact, if the system is controllable and observable, the entire nonzero
spectrum of $\Gamma_G^\dagger \Gamma_G$ can be obtained:

nonzero squared singular values of $\Gamma_G$ = nonzero eigenvalues of $\Gamma_G^\dagger \Gamma_G$
= nonzero eigenvalues of $\Psi_c^\dagger \Psi_o^\dagger \Psi_o \Psi_c$ = eigenvalues of $\Psi_o^\dagger \Psi_o \Psi_c \Psi_c^\dagger$
= eigenvalues of $W_o W_c$ = eigenvalues of $W_c W_o$. (16)

The singular values of $\Gamma_G$ are called the Hankel singular values (HSV).
The nonzero HSV are the eigenvalues of $\sqrt{W_c W_o}$, the set of invariants
mentioned at the end of the previous subsection.

The Hankel norm model reduction problem is useful for finding bounds
for the full $H_\infty$ model reduction problem. In particular, it establishes the
HSV as a measure of response.

Definition 2.3.2 (Hankel Norm Model Reduction Problem) Given
a rank $n$ Hankel operator $\Gamma_G$ corresponding to a stable, causal, LTI system
$G$, such that $\Gamma_G : \mathcal{L}_2^r[-T,0] \to \mathcal{L}_2^p[0,T]$, find $\inf_{\mathrm{rank}(\hat{\Gamma}) \le k} \|\Gamma_G - \hat{\Gamma}\|_{\mathcal{L}_2,i}$ for
$\hat{\Gamma}$ a Hankel operator and $k < n$.

In order to motivate the solution to the above problem, we first need to 
introduce the following theorem. 

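Theorem 2.3.3 Given a rank $n$ matrix $M \in \mathbb{C}^{p \times r}$ ($n \le \min(p, r)$) with
nonzero singular values ordered such that $\sigma_1(M) \ge \sigma_2(M) \ge \cdots \ge \sigma_n(M)$,
for an arbitrary rank $m$ matrix $S \in \mathbb{C}^{p \times r}$ such that $m \le k < n$,

\[ \sigma_{\max}(M - S) \ge \sigma_{k+1}(M). \qquad (17) \]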
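As a numerical illustration of Theorem 2.3.3, the following is a minimal NumPy sketch that is not part of the original analysis. The matrix sizes, the random seed, and the helper names `best_rank_k` and `spectral_norm` are assumptions made for the example. It checks that the truncated singular value decomposition attains the bound in (17), and that another matrix of rank at most k does no better.

```python
import numpy as np

def best_rank_k(M, k):
    """Truncated SVD: keep the k largest singular triplets of M."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vh[:k, :]

def spectral_norm(M):
    """Largest singular value, i.e. sigma_max(M)."""
    return np.linalg.svd(M, compute_uv=False)[0]

rng = np.random.default_rng(0)      # fixed seed, illustrative data only
p, r, k = 6, 5, 2                   # hypothetical sizes
M = rng.standard_normal((p, r))
sigma = np.linalg.svd(M, compute_uv=False)   # sorted, largest first

# The truncated SVD attains sigma_{k+1}(M), stored at zero-based index k.
err_svd = spectral_norm(M - best_rank_k(M, k))

# Some other matrix of rank at most k, built as a product of thin factors.
S = rng.standard_normal((p, k)) @ rng.standard_normal((k, r))
err_other = spectral_norm(M - S)
```

The same truncation idea underlies the Hankel-norm results that follow, with the singular values of the matrix replaced by the Hankel singular values.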



Combining (13) and (15) with Theorem 2.3.3 leads to the next theorem.
This is fundamental to this paper, for it solves the Hankel model reduction
problem.

Theorem 2.3.4 Given a rank $n$ Hankel operator $\Gamma_G$ with nonzero singular
values ordered such that $\sigma_1(\Gamma_G) \ge \sigma_2(\Gamma_G) \ge \cdots \ge \sigma_n(\Gamma_G)$, for an arbitrary
rank $k$ Hankel operator $\hat{\Gamma}$ such that $k < n$,

\[ \|\Gamma_G - \hat{\Gamma}\|_{\mathcal{L}_2,i} \ge \sigma_{k+1}(\Gamma_G). \qquad (18) \]

A limitation of this theorem is that, for finite dimensional systems, there
does not always exist a Hankel operator that makes the inequality an equality.

When equation (18) is combined with $\Gamma_G = P_+ G|_{\mathcal{L}_2[-T,0]}$, we obtain a
lower bound for $G$. For order($G$) $= n$ (i.e. McMillan degree $n$) and for all $\hat{G}$ of order $k \le r$,

\[ \|G - \hat{G}\|_{\mathcal{L}_2,i} \ge \sigma_{r+1}(\Gamma_G). \qquad (19) \]

An interpretation of this lower bound is that the best possible $r$th-order
approximation to the input-output behavior of the system is at least a
"distance" $\sigma_{r+1}(\Gamma_G)$ away from the exact response. It implies that any
reduced-order model that projects out states corresponding to large singular
values is necessarily a worse approximation than a model that projects out
small singular values. Thus, states corresponding to large singular values
contribute the most to the system's response.
• Balanced realizations 

It has been shown that the nonzero HSV correspond to the eigenvalues
of $\sqrt{W_c W_o}$. This suggests that the Hankel operator is related to the
system's controllability and observability. This connection is important
for many reasons. Firstly, interpreting the Hankel operator in terms of
observability and controllability aids intuition. Secondly, the gramians'
eigenvalues are not invariant under coordinate transformations, so we still
lack unambiguous measures of controllability and observability. Lastly,
as we can see in Figure 2, controllability and observability may not be
correlated. For generic realizations, observability and controllability are
not on the same footing and consequently this leads to further ambiguity.
Should model reduction be based on observability or controllability?

The resolution to these problems relies on determining the most suitable
coordinates. This is equivalent to ascertaining the proper way to coarse
grain the system. Our freedom in the choice of coordinates allows us to find
a coordinate transformation, $T$, such that, in the new coordinates, the
controllability and observability gramians are equal and diagonal. The reader
may note that this procedure is essentially the same as is used in filtering
theory. This aligns the observability and controllability ellipsoids, thereby
putting controllability and observability on the same footing. Furthermore,
in these balanced coordinates, $W_c = W_o = \Sigma$, where $\Sigma$ is diagonal and has
the same eigenvalues as $\sqrt{W_c W_o}$ (ordered from largest to smallest). The
eigenvalues of the balanced gramians are the nonzero HSV, invariants of the
system. The resulting system realization is known as a balanced realization.

$T$ can be constructed using the following algorithm [23]. By definition,
there exists a coordinate transformation $S$ such that $S^{-1} W_c W_o S = \Sigma^2$.
Now supposing that $S = W_o^{-1/2} R$ for some $R$, $R^{-1} W_o^{1/2} W_c W_o^{1/2} R = \Sigma^2$.
Hence, $W_o^{1/2} W_c W_o^{1/2}$ is Hermitian and similar to $\Sigma^2$. Provided $\Sigma$ does
not have degenerate eigenvalues, there exists a unique unitary matrix $U$
such that $U^\dagger W_o^{1/2} W_c W_o^{1/2} U = \Sigma^2$. This means that

\[ (\Sigma^{-1/2} U^\dagger W_o^{1/2})\, W_c\, (\Sigma^{-1/2} U^\dagger W_o^{1/2})^\dagger = \Sigma. \]

Thus, remembering that $W_c$ transforms as $W_c \to \tilde{W}_c = T^{-1} W_c (T^\dagger)^{-1}$,
if we let $T^{-1} = \Sigma^{-1/2} U^\dagger W_o^{1/2}$, we have found the desired coordinate
transformation. It follows that:

\[ T^\dagger W_o T = (\Sigma^{1/2} U^\dagger W_o^{-1/2})\, W_o\, (W_o^{-1/2} U \Sigma^{1/2}) = \Sigma. \qquad (20) \]
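The construction of the balancing transformation can be sketched numerically. The block below is a minimal NumPy sketch for real, symmetric positive definite gramians; the function names `sym_sqrt` and `balancing_transformation` and the randomly generated test gramians are assumptions for illustration only, not the authors' code.

```python
import numpy as np

def sym_sqrt(W):
    """Symmetric square root of a symmetric positive definite matrix."""
    lam, V = np.linalg.eigh(W)
    return (V * np.sqrt(lam)) @ V.T

def balancing_transformation(Wc, Wo):
    """Return (T, Tinv, hsv) with Tinv Wc Tinv^T = T^T Wo T = diag(hsv)."""
    Wo_half = sym_sqrt(Wo)
    # Wo^{1/2} Wc Wo^{1/2} is symmetric and similar to Sigma^2.
    sig2, U = np.linalg.eigh(Wo_half @ Wc @ Wo_half)
    order = np.argsort(sig2)[::-1]            # largest HSV first
    sig2, U = sig2[order], U[:, order]
    hsv = np.sqrt(sig2)
    Tinv = (U / np.sqrt(hsv)).T @ Wo_half     # Sigma^{-1/2} U^T Wo^{1/2}
    T = np.linalg.solve(Wo_half, U * np.sqrt(hsv))   # Wo^{-1/2} U Sigma^{1/2}
    return T, Tinv, hsv

# Hypothetical gramians: any symmetric positive definite pair will do here.
rng = np.random.default_rng(1)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))
Wc = X @ X.T + np.eye(4)
Wo = Y @ Y.T + np.eye(4)

T, Tinv, hsv = balancing_transformation(Wc, Wo)
Wc_bal = Tinv @ Wc @ Tinv.T   # transformed controllability gramian
Wo_bal = T.T @ Wo @ T         # transformed observability gramian
```

In the balanced coordinates both `Wc_bal` and `Wo_bal` equal `diag(hsv)` up to rounding, which is the content of the two displayed identities.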



In general, if a system is not controllable and observable, such a $T$ (balancing
transformation) does not exist. It is nevertheless possible to find a
transformation such that:

\[ T^{-1} W_c (T^\dagger)^{-1} = \begin{pmatrix} \Sigma & & & \\ & \Sigma_1 & & \\ & & 0 & \\ & & & 0 \end{pmatrix}, \qquad T^\dagger W_o T = \begin{pmatrix} \Sigma & & & \\ & 0 & & \\ & & \Sigma_2 & \\ & & & 0 \end{pmatrix}, \qquad (21) \]

where $\Sigma$, $\Sigma_1$, and $\Sigma_2$ are all diagonal. $\Sigma$ is the matrix of HSV and the
corresponding subsystem is controllable and observable. The subsystem
associated with $\Sigma_1$ is controllable and unobservable, while the one associated
with $\Sigma_2$ is uncontrollable and observable.
• Balanced truncation 

Now we possess the tools to generate reduced-order models. The reduction
technique in what follows is called balanced truncation. We assume
that the system is stable, controllable and observable, and has been
transformed into a balanced realization. Additionally, since the system is stable,
we consider the problem over an infinite time horizon (i.e. $T \to \infty$).

Given a system satisfying the above assumptions, decompose the matrix
of ordered HSV $\Sigma$ (ordered from largest to smallest) such that the first $r$
eigenvalues form the matrix $\Sigma_L$. The remainder form the matrix of smaller
singular values $\Sigma_S$. Decompose $\Sigma_L$ and $\Sigma_S$ such that they have no common
eigenvalues. The realization for the full system takes the form:

\[ A = \begin{pmatrix} A_L & A_{12} \\ A_{21} & A_S \end{pmatrix}, \qquad B = \begin{pmatrix} B_L \\ B_S \end{pmatrix}, \qquad C = \begin{pmatrix} C_L & C_S \end{pmatrix}. \qquad (22) \]



By projecting out the states corresponding to $\Sigma_S$, the remaining three
matrices, $(A_L, B_L, C_L)$, form an $r$-dimensional realization that approximates
the original system. Denote the input-output operator for the reduced
system by $G_r$. This realization, by construction, is stable and balanced. Its
HSV are the eigenvalues of $\Sigma_L$. In addition, the approximation error is
bounded by:

\[ \|G - G_r\|_{\mathcal{L}_2,i} \le 2 \sum_{j=1}^{k} \sigma^{\mathrm{dist}}_{i_j}, \qquad (23) \]

where $\{\sigma^{\mathrm{dist}}_{i_j} : 1 \le j \le k\}$ is the set of distinct HSV in $\Sigma_S$.
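Balanced truncation and the error bound (23) can be illustrated with a short, self-contained NumPy sketch. Everything specific here is an assumption for illustration: the small upper-triangular test system, the choice B = C = I, the helper names, and the frequency grid. The induced-norm error is only estimated on a grid, so the estimate cannot exceed the true error.

```python
import numpy as np

def lyap(A, Q):
    """Solve A W + W A^T + Q = 0 by vectorisation (fine for small n)."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)

def sym_sqrt(W):
    lam, V = np.linalg.eigh((W + W.T) / 2)
    return (V * np.sqrt(lam)) @ V.T

def balanced_truncation(A, B, C, r):
    """Balance (A, B, C) and keep the r states with the largest HSV."""
    Wc = lyap(A, B @ B.T)        # controllability gramian
    Wo = lyap(A.T, C.T @ C)      # observability gramian
    Wo_half = sym_sqrt(Wo)
    sig2, U = np.linalg.eigh(Wo_half @ Wc @ Wo_half)
    order = np.argsort(sig2)[::-1]
    hsv, U = np.sqrt(sig2[order]), U[:, order]
    T = np.linalg.solve(Wo_half, U * np.sqrt(hsv))
    Tinv = (U / np.sqrt(hsv)).T @ Wo_half
    Ab, Bb, Cb = Tinv @ A @ T, Tinv @ B, C @ T
    return Ab[:r, :r], Bb[:r, :], Cb[:, :r], hsv

def transfer(A, B, C, w):
    """Frequency response C (iw I - A)^{-1} B."""
    return C @ np.linalg.solve(1j * w * np.eye(A.shape[0]) - A, B)

# Hypothetical stable test system (eigenvalues -0.3, ..., -1.8).
n, r = 6, 2
A = -0.3 * np.diag(np.arange(1, n + 1)) + 0.5 * np.triu(np.ones((n, n)), 1)
B = np.eye(n)
C = np.eye(n)

Ar, Br, Cr, hsv = balanced_truncation(A, B, C, r)

# Grid estimate of sup_w sigma_max(G(iw) - G_r(iw)).
omegas = np.concatenate(([0.0], np.logspace(-3, 3, 2000)))
err = max(
    np.linalg.svd(transfer(A, B, C, w) - transfer(Ar, Br, Cr, w),
                  compute_uv=False)[0]
    for w in omegas
)
bound = 2.0 * hsv[r:].sum()   # right-hand side of (23)
```

Comparing `err` with `bound` and with `hsv[r]` shows how tightly the truncated Hankel singular values control the response error for this particular example.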

These techniques are extended to unstable systems in the appendix. This
makes it possible to use these techniques on linear, conservative systems.
A particularly relevant result is:

Theorem 2.3.5 (Lower Bound) Let $G$ be an LTI, causal system with an
$n$-dimensional minimal realization $(A, B, C)$. If there exists an "a" such that
$-aI + A$ is a stable system matrix, then for any approximant $G_r$ of order $r$ (or less),

\[ \|G - G_r\|_{\mathcal{L}_2[0,T],i} \ge \left(1 - e^{-2aT} \|e^{A^\dagger T} e^{AT}\|_{\mathbb{C}^n}\right) \sigma_{r+1}(a). \]

The next section applies these methods to general oscillator systems.
Combined with the spatial content of the reductions, the results specify
how to coarse grain.

3 Reduction of Oscillator Systems 

The standard form for the equations of motion generated by a quadratic
Hamiltonian with $2N$ phase space degrees of freedom is given by

\[ \frac{d}{dt}\begin{pmatrix} Q \\ P \end{pmatrix} = \begin{pmatrix} 0 & \Omega \\ -\Omega & 0 \end{pmatrix} \begin{pmatrix} Q \\ P \end{pmatrix}, \qquad (24) \]

where $\Omega$ is an $N \times N$ positive definite matrix. These systems are typically
associated with coupled harmonic oscillators. Furthermore, these systems are
considered trivial because when expressed in normal modes, the resulting
oscillators are decoupled. Decoupled oscillators are considered noninteracting.
This view is correct for isolated systems; for open systems, however, it
is not.

The coordinates that capture the original experimental configuration are
of exceptional interest. This is because the important coordinate system for
model reduction is the one in which the gramians are balanced. While the
gramians are diagonalized in balanced coordinates, the matrices $A$ or
$\Omega$ need not be. This illustrates that a generic open system, even a linear
one, typically has interacting internal states. The statistical mechanics of
quadratic Hamiltonians is invalid unless the system is driven. This follows
since when expressed in normal modes, the system is noninteracting and,
hence, not ergodic or mixing [13, 14]. The usual heuristic argument for
justifying the practice is that phonon (oscillator) systems do not truly have
a linear dispersion; the systems themselves are nonlinear. The nonlinearity
is responsible for the mixing of states. This is precisely the issue that
spawned the Fermi-Pasta-Ulam (FPU) problem (see footnote 10). Once the nonlinearity
from the heuristic argument is associated to a disturbance of the form
$Bw(t)$, then the analysis in this paper agrees with the heuristic argument.

The advantages of using balanced coordinates rather than modal coordinates
reflect the sensitivity of the system to the choice of experiment. This
sensitivity to experiment suggests that it is more appropriate for oscillator
systems (see footnote 11) to investigate open systems:

\[ \dot{z} = \begin{pmatrix} \dot{z}_1 \\ \dot{z}_2 \end{pmatrix} = Az + Bu = \begin{pmatrix} 0 & I \\ -\Omega^2 & 0 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} + Bu, \qquad y = Cz. \qquad (25) \]

In this coordinate system, $z_1$ represents the spring displacements while $z_2$
represents the corresponding velocities.

There remains the question of which experiments should be considered.
Conceptually, $B$ varies the gains of the driving. $B$ determines how accessible
particular states are to driving. Alternatively, allowing for Dirac delta
driving and setting $z_0 = 0$, the driving may be used to prepare the system's
initial conditions. With this interpretation, $B$ prescribes how initial
conditions are weighted. Similarly, $C$ indicates which internal states are
measured and how easily. We exclusively consider oscillator systems with
uniform constituent sizes and masses. This choice, motivated by thermodynamics
and the equipartition of energy, implies that positions and momenta
are treated equally. This is equivalent to assuming that the position and
momentum of each particle can be driven with equal gain. Hence, $B = bI$,
where $b$ is a constant. Considering each position and momentum equally
difficult to measure corresponds to setting $C = cI$. Now let $b = c = 1$.
Mathematically this choice is natural since the resulting input-output
operator (in the Laplace domain) is the full system's resolvent, $(sI - A)^{-1}$.

Fixing $B$ and $C$ dictates the type of experiment. However, it does not
fully determine the experiment. For the problem to be well-posed, we also
need to specify the duration of the experiment. This is important because
the exact form of the experiment fully determines its input-output
characteristics (i.e. the input-output operator). Different experiments give rise
to different measures of response. This is illustrated in Figure 3. As a
reminder, Hankel singular values (HSV) tell us how much their corresponding
states contribute to the response. These states are roughly the system's
input-output resonances.

10 It is interesting to note that coupled, nonlinear, anharmonic oscillator systems are
not guaranteed to mix. This fact has been attributed to the existence of soliton solutions
to the equations of motion.

11 At least oscillator systems with uniform masses.




Figure 3: A plot of the ordered Hankel singular values (HSV) for the
homogeneous, harmonic oscillator chain. $T$ is the time over which the system is
investigated. The HSV in (a) are plotted for $T \propto N^{1/2}$, while $T \propto N^2$ for (b),
where $N$ is the total number of masses in the chain. In each case $N = 49$.

Figure 3(b) depicts how different types of experiments on an oscillator
system, over the same duration, $T = N^2$, give rise to different normalized
HSV and, consequently, different measures of response. Figure 3(a) displays
the same types of experiments, but with $T = N^{1/2}$. Clearly, the normalized
HSV for two of the experiments have completely changed. Figure 3(a)
demonstrates that experiments over the shorter time frames tend to admit
lower-order reduced models. Hence, time scales have an enormous impact
on model reduction.

Although experiments over short time horizons directly lead to very 
low-order reduced models 2 , our intent for introducing a finite cutoff time 
is to regulate divergences. The divergences arise due to the fact that the 
HSV become infinite in the infinite-time horizon limit. This is discussed 
in more detail in the appendix. Now with the divergences regulated, we
return to addressing how to coarse grain physical systems. From above,
we learn that this is not a well-posed question since different time scales or
experiments lead to different reductions. A well-posed reduction problem
requires specifying the type of experiment and the time scale. It is natural
to expect that past a certain time scale (i.e. past thermodynamic
equilibration), there is a unique way to coarse grain. It is for this reason that we
approximate the response of the system in equation (25) with $B = C = I$
over a finite yet long time horizon. Interestingly, not only does there exist
such a time scale, but some Hankel norm results determine precisely the
coarse grainings. These topics comprise the following subsections. In the
first, we work without restriction on the form of $\Omega$ other than that it is
positive definite. In the second, we consider the case of the homogeneous
linear harmonic chain, and lastly we treat some heterogeneous
linear oscillator chains.



3.1 General Oscillator Systems 

To determine the best possible coarse graining, or at least one near the optimal
coarse graining, we will proceed to use the Hankel operator machinery to
obtain bounds for $\|G - G_r\|_{\mathcal{L}_2[0,T],i}$. Recalling the control theory tutorial,
this provides us with a criterion for model reduction. In particular, the
lower bound,

\[ \|G - \hat{G}\|_{\mathcal{L}_2,i} \ge \sigma_{r+1}(\Gamma_G), \]

where $\sigma_{r+1}(\Gamma_G)$ is the $(r+1)$th HSV, confirms that the states with large
HSV contribute the most to the response. The upper bound,

\[ \|G - G_r\|_{\mathcal{L}_2,i} \le 2 \sum_{j=1}^{k} \sigma^{\mathrm{dist}}_{i_j}, \]

where $\{\sigma^{\mathrm{dist}}_{i_j} : 1 \le j \le k\}$ is the set of distinct HSV with $i_j > r$, ensures
that our approximations are controlled. The first step is to determine the
HSV that provide these bounds. However, in doing so we also find the
balancing transformation. This makes it trivial to truncate the system and
obtain reduced-order models.

Determining the HSV requires calculating the controllability and
observability gramians. Unfortunately, restricting attention to a finite cutoff
time complicates the analysis. For instance, $W_c$ and $W_o$ satisfy differential
equations (see equations (10) and (11)) instead of Lyapunov equations. A
method of simplifying the analysis involves investigating the system
matrix $-aI + A$ over an infinite time horizon instead of $A$ over a finite time
horizon. This procedure is known as exponential discounting. Intuitively,
"a" should be on the order of the inverse time cutoff for the approximation
to be any good. Fortunately, this intuition can be made rigorous,
thereby keeping all approximations under control.

For the systems under consideration, the gramians are formally given
by

\[ W_c = \int_0^T e^{At} e^{A^\dagger t}\,dt, \qquad W_c^{(a)} = \int_0^\infty e^{-2at} e^{At} e^{A^\dagger t}\,dt, \]
\[ W_o = \int_0^T e^{A^\dagger t} e^{At}\,dt, \qquad W_o^{(a)} = \int_0^\infty e^{-2at} e^{A^\dagger t} e^{At}\,dt. \qquad (26) \]

A property of these gramians is that $W_c W_o$ or $W_c^{(a)} W_o^{(a)}$ are always
similar to a matrix of the form

\[ \begin{pmatrix} M & 0 \\ 0 & M^\dagger \end{pmatrix}. \]

This means that there is
an exact twofold degeneracy in the HSV independent of "a". In fact, under the
transformation $R$ defined in Appendix B, $A$ takes the form of the system
matrix in equation (24). In this basis, we find that $W_c \approx W_o$, where



\[ W_c = \frac{1}{4a} \begin{pmatrix} \Omega + \Omega^{-1} + O(a^2) & a\,(\Omega^{-2} - I) + O(a^2) \\ a\,(\Omega^{-2} - I) + O(a^2) & \Omega + \Omega^{-1} + O(a^2) \end{pmatrix}. \qquad (27) \]



In this basis the gramians are almost balanced. Provided we transform the
system by a unitary transformation, $U_d$, to diagonalize $\Omega$ (i.e. $\Omega \to \Lambda_\Omega$)
and we take "a" sufficiently small (sufficiently long time horizons), the
gramians are balanced. The precise interpretation of "sufficiently small" is
outlined in the appendix. We want to require that "a" is small enough so
that the off-diagonal terms do not change the ordering of the HSV. Here we
assume, without loss of generality, that the eigenvalues of $\Lambda_\Omega$ are ordered
from smallest to largest. Under the previously mentioned conditions, to
$O(a^2)$ the balanced gramians (balanced up to permutation) take the form

\[ W_c = W_o = \frac{1}{4a} \begin{pmatrix} \Lambda_\Omega + \Lambda_\Omega^{-1} + O(a^2) & O(a) \\ O(a) & \Lambda_\Omega + \Lambda_\Omega^{-1} + O(a^2) \end{pmatrix}. \qquad (28) \]

These balanced gramians are associated with the linear system

(29)

Given that $\lambda_j(\Omega) \le \lambda_k(\Omega)$ for all $j < k$, let $\alpha$ be a permutation such
that $\lambda_{\alpha(j)}(\Omega) + \lambda^{-1}_{\alpha(j)}(\Omega) \le \lambda_{\alpha(k)}(\Omega) + \lambda^{-1}_{\alpha(k)}(\Omega)$ for all $k < j$. Trivially,
via a unitary transformation, the gramians in equation (28) may be fully
balanced (the HSV are ordered). In this case, we find that

\[ \sigma_k(a) = \frac{1}{4a} \left( \lambda_{\alpha(\lceil k/2 \rceil)}(\Omega) + \lambda^{-1}_{\alpha(\lceil k/2 \rceil)}(\Omega) \right) + O(1). \qquad (30) \]
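The leading-order expression for the Hankel singular values can be checked numerically with exponential discounting. The block below is a minimal NumPy sketch, assuming a small hypothetical spectrum for $\Omega$, the realization of equation (25) with $B = C = I$, and a small discount "a"; the helper `lyap` is an illustrative dense Lyapunov solver, not a library routine.

```python
import numpy as np

def lyap(A, Q):
    """Solve A W + W A^T + Q = 0 by vectorisation (fine for small n)."""
    n = A.shape[0]
    I = np.eye(n)
    K = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)

rng = np.random.default_rng(0)
lam = np.array([0.4, 0.7, 1.5, 3.0])      # hypothetical eigenvalues of Omega
N = len(lam)
V, _ = np.linalg.qr(rng.standard_normal((N, N)))
Omega = V @ np.diag(lam) @ V.T            # symmetric positive definite

a = 1e-3                                  # discount, roughly 1/T
A = np.block([[np.zeros((N, N)), np.eye(N)],
              [-Omega @ Omega, np.zeros((N, N))]])
Aa = A - a * np.eye(2 * N)                # discounted system matrix -aI + A

Wc = lyap(Aa, np.eye(2 * N))              # B = I
Wo = lyap(Aa.T, np.eye(2 * N))            # C = I

hsv = np.sort(np.sqrt(np.linalg.eigvals(Wc @ Wo).real))[::-1]
predicted = np.sort(np.repeat((lam + 1.0 / lam) / (4.0 * a), 2))[::-1]
```

For small "a" the computed `hsv` come in nearly equal pairs and agree with `predicted` to within a relative error of order "a", consistent with the twofold degeneracy noted above.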



The degeneracy of the HSV suggests that any balanced truncation that
keeps states corresponding to the first $r = 2q$ HSV will remain conservative.
In fact, we can see this by inspection of equation (29). With this in mind,
we will consider only truncations that keep an even number of states. The
immediate consequences of these results are that we obtain bounds on the
approximation error of the response. To leading order in "a",

\[ \|G - G_{2q}\|_{\mathcal{L}_2[0,T],i} \ge (1 - e^{-2aT})\,\sigma_{2q+1}(a) \approx \frac{1 - e^{-2aT}}{4a} \left( \lambda_{\alpha(q+1)}(\Omega) + \lambda^{-1}_{\alpha(q+1)}(\Omega) \right), \qquad (31) \]

and

\[ \|G - G_{2q}\|_{\mathcal{L}_2[0,T],i} \le 2 e^{aT} \sum_{j=q}^{N-1} \sigma_{2j+1}(a). \qquad (32) \]

Additionally, by using linear-matrix-inequality (LMI) techniques a
substantially tighter upper bound can be established. The improved bound
is

\[ \|G - G_{2q}\|_{\mathcal{L}_2[0,T],i} \le 2 e^{aT} \sigma_{2q+1}(a). \qquad (33) \]

A remarkable aspect of these oscillator models is that, over a long time 
horizon, the relation between the system's spectrum and the gramians is 
simple. Despite this simple relationship, these results also establish that 
the set of best reductions (i.e. those that satisfy the lower bound) need 
not be modal reductions! Modal reductions explicitly neglect (project out) 
the system's fast dynamics. For instance, let us suppose that $\lambda_j(\Omega) > 1$
for all $j$; disregarding degeneracy, the HSV are automatically ordered from
smallest to largest. This implies that projecting out small HSV eliminates
states corresponding to slow modes! This is contrary to what is typically
done. Alternatively, suppose that $\lambda_j(\Omega) < 1$ for all $j$. In this instance,
disregarding degeneracy, the HSV are ordered from largest to smallest.
This is precisely when modal reduction is appropriate. Lastly, when the
eigenvalues of $\Omega$ are both greater than and less than one, the appropriate
reductions involve a mixing of fast and slow modes.
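The three regimes can be made concrete with a small NumPy sketch. It assumes only the leading-order result that each mode's HSV is proportional to $\lambda + \lambda^{-1}$; the function name `retained_modes` and the sample spectra are hypothetical.

```python
import numpy as np

def retained_modes(lam, q):
    """Zero-based indices of the q modes with the largest lam + 1/lam."""
    weight = lam + 1.0 / lam                  # proportional to the HSV
    alpha = np.argsort(-weight, kind="stable")
    return alpha[:q]

# Eigenvalues of Omega, each list ordered from slowest to fastest mode.
slow_spectrum = np.array([0.2, 0.4, 0.6, 0.8])     # all below one
fast_spectrum = np.array([1.5, 2.0, 3.0, 4.0])     # all above one
mixed_spectrum = np.array([0.25, 0.9, 1.2, 3.0])   # straddles one

keep_slow = retained_modes(slow_spectrum, 2)    # slowest modes kept
keep_fast = retained_modes(fast_spectrum, 2)    # fastest modes kept
keep_mixed = retained_modes(mixed_spectrum, 2)  # one slow and one fast mode
```

Only in the first case does the reduction coincide with the usual modal truncation that discards fast dynamics.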

In the basis that produces the realization in equation (29), the system
is balanced, and yet $\Omega$ is diagonalized. Although this means that such
systems are noninteracting, not all internal states are treated equally. In
fact, in the case of the linear harmonic chain, the weighting of the gains
has a physically meaningful interpretation that will be elaborated upon
in Section 3.2. Also, there is an enormous degeneracy in the types of
experiments producing equivalent reductions. The same reductions result
from using $B = V$ and $C = U$, where $V$ and $U$ are arbitrary unitary
matrices. This may not come as much of a surprise since requiring $V$ and
$U$ to be unitary causes the internal states to be treated equally. We see
that there are an infinite number of inequivalent realizations that yield the
same reduction.

These results also reveal how to choose "a". By varying "a" we may
refine our bounds. Generally, we have an LTI, causal system that achieves
or almost achieves the lower bound. Consequently, it is useful to choose
"a" such that the lower bound is at its maximum. The maximum occurs
as $a \to 0$:

\[ \|G - G_{2q}\|_{\mathcal{L}_2[0,T],i} \ge \frac{T}{2} \left( \lambda_{\alpha(q+1)}(\Omega) + \lambda^{-1}_{\alpha(q+1)}(\Omega) \right). \qquad (34) \]

For the lowest upper bound (from balanced truncation), $\frac{d}{da}\left[ 2 e^{aT} \sigma_{2q+1}(a) \right] = 0$
implies that $a = T^{-1}$. Therefore, the minimum upper bound is

\[ \|G - G_{2q}\|_{\mathcal{L}_2[0,T],i} \le \frac{eT}{2} \left( \lambda_{\alpha(q+1)}(\Omega) + \lambda^{-1}_{\alpha(q+1)}(\Omega) \right). \qquad (35) \]
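The choice of "a" can be checked numerically. The following minimal NumPy sketch assumes the leading-order form $\sigma_{2q+1}(a) \approx X/(4a)$, where $X$ stands for $\lambda + \lambda^{-1}$ of the first discarded mode; the values of `T` and `X` and the search grid are hypothetical.

```python
import numpy as np

T = 50.0     # hypothetical time horizon
X = 2.5      # hypothetical value of lambda + 1/lambda for mode alpha(q+1)

def upper(a):
    """Improved upper bound 2 e^{aT} sigma(a), with sigma(a) = X / (4a)."""
    return 2.0 * np.exp(a * T) * X / (4.0 * a)

def lower(a):
    """Lower bound (1 - e^{-2aT}) sigma(a), with sigma(a) = X / (4a)."""
    return -np.expm1(-2.0 * a * T) * X / (4.0 * a)

a_grid = np.linspace(1e-4, 0.2, 200001)
a_best = a_grid[np.argmin(upper(a_grid))]   # minimiser of the upper bound

upper_min = upper(1.0 / T)      # compare with e * T * X / 2
lower_max = lower(1e-9)         # limit a -> 0, compare with T * X / 2
```

Under these assumptions the two optimised bounds differ only by the factor e, which is why the relative-error bounds derived from them are comparatively tight.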

As we approximate $G$ over an infinite time horizon ($T \to \infty$), both the
norm of $G$ and the absolute error diverge. This divergence is unavoidable
and generally independent of the number of particles (oscillators) in the
system. However, there is another possible interpretation if the number
of particles, $N$, is large. For our analytics to be valid, there must be
restrictions on the size of "a". Combining equations (71) and (73) from
the appendix, we are able to relate the maximal "a" to the frequencies
of the system, $\lambda_k(\Omega)$. In most cases, as $N$ gets large, $\lambda_k(\Omega) \sim N^{-\delta_1}$.
For instance, in the case of the homogeneous linear harmonic chain, for
small wave number, $\delta_1 = 1$. Consequently, "a" is forced to fall to zero
asymptotically like $a \sim N^{-\delta_2}$ for some $\delta_2 > 0$ as $N \to \infty$. Reciprocally,
if "a" tends to zero faster and, consequently, $T$ tends to infinity, there
is no restriction on $N$. The divergence is due to the infinite time horizon.
Suppose that "a" is parameterized by $N$ and we consider the $N \to \infty$ limit.
In this case, the divergence is due to the infinite-particle limit. For oscillator
systems, like photons and phonons, the divergence is attributable to the
absence of a mass gap (i.e. the eigenvalues of $\Omega$ become dense near zero).
Thus, there is no inherent length (mass) scale for the system. This is one of
the simplest divergences, a long-wavelength divergence. Depending on the
structure of $\Lambda_\Omega$, however, there may also be a short-wavelength divergence
or even possibly a mixed-wavelength divergence. Had we investigated yet a
shorter time scale, still taking $T \to \infty$, the resulting reduced-order systems
would typically be dissipative. Physically this is a manifestation of how
fluctuations may induce time scales [14].

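While the divergence in $\|G\|_{\mathcal{L}_2,i}$ and $\|G - G_r\|_{\mathcal{L}_2,i}$ has been explained,
its consequences have not. Since the error estimate diverges, except when
regulated (i.e. by considering finite $T$), the absolute error is not a meaningful
quantity. Any long-time approximant is asymptotic at best. This means
that the (regulated) relative error, $\lim_{T \to \infty} \|G - G_r\|_{\mathcal{L}_2[0,T],i} / \|G\|_{\mathcal{L}_2[0,T],i}$,
is much more useful. Combining equations (31) and (32) yields rather
conservative bounds:

\[ \lim_{T \to \infty} \frac{\|G - G_{2q}\|_{\mathcal{L}_2[0,T],i}}{\|G\|_{\mathcal{L}_2[0,T],i}} \ge \lim_{T \to \infty} \frac{\frac{T}{2} \left( \lambda_{\alpha(q+1)}(\Omega) + \lambda^{-1}_{\alpha(q+1)}(\Omega) \right)}{\frac{eT}{2} \sum_{j=1}^{N} \left( \lambda_{\alpha(j)}(\Omega) + \lambda^{-1}_{\alpha(j)}(\Omega) \right)} = \frac{\lambda_{\alpha(q+1)}(\Omega) + \lambda^{-1}_{\alpha(q+1)}(\Omega)}{e \sum_{j=1}^{N} \left( \lambda_{\alpha(j)}(\Omega) + \lambda^{-1}_{\alpha(j)}(\Omega) \right)} \qquad (36) \]

and

\[ \lim_{T \to \infty} \frac{\|G - G_{2q}\|_{\mathcal{L}_2[0,T],i}}{\|G\|_{\mathcal{L}_2[0,T],i}} \le \lim_{T \to \infty} \frac{\frac{eT}{2} \sum_{j=q+1}^{N} \left( \lambda_{\alpha(j)}(\Omega) + \lambda^{-1}_{\alpha(j)}(\Omega) \right)}{\frac{T}{2} \left( \lambda_{\alpha(1)}(\Omega) + \lambda^{-1}_{\alpha(1)}(\Omega) \right)} = \frac{e \sum_{j=q+1}^{N} \left( \lambda_{\alpha(j)}(\Omega) + \lambda^{-1}_{\alpha(j)}(\Omega) \right)}{\lambda_{\alpha(1)}(\Omega) + \lambda^{-1}_{\alpha(1)}(\Omega)}. \qquad (37) \]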



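Comparatively tighter bounds are obtained by using equations (34) and
(35). These bounds are

\[ \lim_{T \to \infty} \frac{\|G - G_{2q}\|_{\mathcal{L}_2[0,T],i}}{\|G\|_{\mathcal{L}_2[0,T],i}} \ge \frac{\lambda_{\alpha(q+1)}(\Omega) + \lambda^{-1}_{\alpha(q+1)}(\Omega)}{e \left( \lambda_{\alpha(1)}(\Omega) + \lambda^{-1}_{\alpha(1)}(\Omega) \right)}, \qquad (38) \]

and

\[ \lim_{T \to \infty} \frac{\|G - G_{2q}\|_{\mathcal{L}_2[0,T],i}}{\|G\|_{\mathcal{L}_2[0,T],i}} \le \frac{e \left( \lambda_{\alpha(q+1)}(\Omega) + \lambda^{-1}_{\alpha(q+1)}(\Omega) \right)}{\lambda_{\alpha(1)}(\Omega) + \lambda^{-1}_{\alpha(1)}(\Omega)}. \qquad (39) \]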

These general cases have allowed us, for instance, to determine conditions
when modal reduction is appropriate. Without knowing more about
$\Omega$ it is not possible to discern the spatial content of the reductions. Without
the spatial content we cannot specify the relationship between reduction 
type and coarse graining. In the following section, we will apply the above 
results to the linear harmonic chain, from which we determine the appro- 
priate coarse grainings. This example will clarify the relationship between 
system reductions and coarse grainings. 



3.2 The Linear Harmonic Chain 

The models that we consider in this section are all variants of the
one-dimensional linear harmonic chain depicted in Figure 4. The system
consists of a chain of $N$ equally spaced masses, each with mass $m$, connected via
$N + 1$ springs. The chain is connected on each side to stationary walls. We
will first treat coarse graining the linear harmonic chain with homogeneous
springs in great detail. Briefly, we also present how to coarse grain some
heterogeneous chains. We will conclude by comparing the different models
and their respective coarse grainings.

Figure 4: Linear chain of oscillators with fixed boundary conditions. All of
our models are of this form. The different variations have homogeneous, layered,
and randomly, uniformly sampled spring constants.
• The homogeneous chain 

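Each spring of the homogeneous linear chain has a spring constant, $k$.
For this system, $\Omega^2$ has the familiar form,

\[ \Omega^2 = \frac{k}{m} \begin{pmatrix} 2 & -1 & & & 0 \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ 0 & & & -1 & 2 \end{pmatrix}. \qquad (40) \]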



The matrix of ordered eigenvalues of $\Omega$, $\Lambda_\Omega$, is such that $(\Lambda_\Omega)_{pp} = \omega_p = 2\sqrt{k/m}\,\sin\left( \frac{p\pi}{2(N+1)} \right)$. Additionally, the unitary matrix that diagonalizes
$\Omega$, $U_d$, is given by $(U_d)_{pj} = (u^{(j)})_p = \sqrt{\frac{2}{N+1}}\,\sin\left( \frac{pj\pi}{N+1} \right)$. Here $u^{(j)}$ is
the eigenvector such that $\Omega u^{(j)} = \omega_j u^{(j)}$. Not only is $U_d$ both orthogonal
and symmetric; its action on vectors is almost that of a discrete Fourier
transform. It is not actually a Fourier transform since the spatial domain
of lattice sites is not translationally invariant. Had we considered the linear
harmonic chain on a ring instead (i.e. the group $\mathbb{Z}_N$), then the action of
$U_d$ on vectors would, in fact, be a Fourier transform. The main point here
is that local spatial rescaling in real space corresponds to rescaling large
wave vectors in Fourier space.
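The spectrum and eigenvectors of the homogeneous chain are easy to verify numerically. The block below is a minimal NumPy sketch; the chain length `N`, the spring constant `kappa` (the text's k), and the mass `m` are assumed values chosen for illustration.

```python
import numpy as np

N, kappa, m = 8, 0.25, 1.0       # hypothetical chain parameters

# Omega^2 for fixed walls: (kappa/m) times the tridiagonal (-1, 2, -1) matrix.
Omega2 = (kappa / m) * (2.0 * np.eye(N)
                        - np.eye(N, k=1)
                        - np.eye(N, k=-1))

p = np.arange(1, N + 1)
omega = 2.0 * np.sqrt(kappa / m) * np.sin(p * np.pi / (2 * (N + 1)))

# Sine-transform matrix: entry (p, j) is sqrt(2/(N+1)) sin(p j pi / (N+1)).
Ud = np.sqrt(2.0 / (N + 1)) * np.sin(np.outer(p, p) * np.pi / (N + 1))

diagonalised = Ud.T @ Omega2 @ Ud    # should equal diag(omega**2)
```

Because `Ud` is both symmetric and orthogonal it is its own inverse, which is the discrete sine transform analogue of the Fourier transform on a ring.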

Motivated by model reduction, we consider two particularly interesting
limits. For the first case, let $2\sqrt{k/m} \le 1$ and $N \gg 1$. In the second case
we take the mass and spring constants to be functions of $N$ such that
$2\sqrt{k(N)/m(N)} \ge 2(N+1)/\pi$ and $N \gg 1$. The former case will be discussed in
detail. After that the nuances of the latter case will be clear.

In the first case, $\omega_p < 1$ for all $p \in \{1, \ldots, N\}$. Consequently, $\alpha(p) = p$.
Hence, the HSV are already ordered from largest to smallest. Also, the
minimal time scale over which this analysis is valid is determined by the
limits on "a". When "a" satisfies the inequality in equation (73), which
implies "a" at least scales as $N^{-2}$ (if not faster in $N$), the ordering of the
HSV is not altered. This gives the absolute error bounds when combined
with the expression for $\omega_p$ and equations (34) and (35):

\[ \|G - G_{2q}\|_{\mathcal{L}_2[0,T],i} \ge \frac{T}{2} \left( \omega_{q+1} + \omega_{q+1}^{-1} \right), \qquad \omega_{q+1} = 2\sqrt{k/m}\,\sin\left( \frac{(q+1)\pi}{2(N+1)} \right), \qquad (41) \]

\[ \|G - G_{2q}\|_{\mathcal{L}_2[0,T],i} \le \frac{eT}{2} \left( \omega_{q+1} + \omega_{q+1}^{-1} \right) = \frac{eT(N+1)}{2\pi(q+1)\sqrt{k/m}} \left( 1 + O\left( \left( \frac{q+1}{N+1} \right)^2 \right) \right). \qquad (42) \]

It is no surprise that the appropriate reductions project out the fast 
modes since in this limit the dispersion is linear. In the limit of large N, 
truncating fast modes is the same as projecting out large wave numbers. 
However, as mentioned earlier, large wave numbers correspond to short 
distances. So we see that projecting out fast modes from this system is 
equivalent to locally coarse graining. In fact, the lower bound suggests 
a stronger result. Provided the lower bound is approximately achievable, 
though the reduced-order system may not be LTI, the best possible reduc- 
tions must involve locally coarse graining (modally reducing) the system. 
For example, projecting out a slow mode via balanced truncation is an 
example of nonlocal coarse graining. The lower bound of the incurred ap- 
proximation error involves the HSV corresponding to that nonlocal state. 
Since the HSV corresponding to slow (nonlocal) modes are larger than
those of fast modes, any nonlocal approximant of the system is necessarily
worse by equation (41).




Figure 5: (a) A plot of the ordered Hankel singular values (HSV) for the
homogeneous, harmonic oscillator chain. The spring constants are uniformly
taken to be $k = 0.25$. The HSV are plotted for $T \propto N^2$ where $N = 49$. The
distribution of HSV remains essentially unchanged for any larger choice of $T$. (b)
A plot of the frequencies for the same system.

The bounds in equations (41) and (42) also, at least for this model,
provide information regarding finite size effects. If we take the lattice spacing
to be $b$ and the system size of the approximate system to be $L = qb$, the
bounds then imply that (for $N \gg 1$)

\[ \lim_{T \to \infty} \frac{\|G - G_{2q}\|_{\mathcal{L}_2[0,T],i}}{\|G\|_{\mathcal{L}_2[0,T],i}} = O(L^{-1}). \qquad (43) \]

This result is not new, though these techniques provide a new way to obtain 
it. Additionally, these techniques imply that for more general systems 
or experiments, reductions may have quite a different dependence on the 
system size. 
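The inverse scaling with the size of the retained system can be seen directly from the relative-error bounds (38) and (39). The following minimal NumPy sketch assumes the homogeneous chain with $\omega_p < 1$, so that $\alpha(p) = p$; the function name `relative_error_bounds` and the parameter values are hypothetical.

```python
import numpy as np

def relative_error_bounds(N, q, kappa=0.25, m=1.0):
    """Lower and upper bounds (38), (39) for a chain with all omega_p < 1."""
    p = np.arange(1, N + 1)
    omega = 2.0 * np.sqrt(kappa / m) * np.sin(p * np.pi / (2 * (N + 1)))
    w = omega + 1.0 / omega      # proportional to the HSV; alpha(p) = p
    ratio = w[q] / w[0]          # mode q+1 relative to mode 1
    return ratio / np.e, np.e * ratio

N = 2000
# For q much smaller than N the weight ratio is close to 1/(q+1).
scaled = []
for q in (1, 9, 49):
    lo, hi = relative_error_bounds(N, q)
    scaled.append(np.sqrt(lo * hi) * (q + 1))
```

Each entry of `scaled` is close to one, so both bounds fall off as 1/(q+1), that is, inversely with the length L = qb of the approximate system up to a constant.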

It is apparent from (29) that the balanced realization for the harmonic
chain weights driving of the momenta by $\Lambda_\Omega^{-1/2} U_d$. Since $(\Lambda_\Omega)_{pp} = \omega_p$,
this gives more weight to momenta corresponding to small frequencies.
Therefore it requires smaller gains to activate the slower modes. If driving 
gives nontrivial initial conditions (i.e. impulses), this is equivalent to slow- 
mode initial conditions being more easily excited than fast-mode initial 
conditions. While the balanced realization of the system implies that the 
internal states of the system are noninteracting, it also implies that differing 
normal modes are not treated equally. This again suggests that the most 
natural coarse grainings are local coarse grainings. 

Consider the latter conditions mentioned earlier, 2\ / — -. — \ > — — — — - 

y m(N) ~ it 

and $N \gg 1$. Here $\omega_p > 1$ for all $p \in \{1, \ldots, N\}$, which
implies that $\alpha(p) = N + 1 - p$. The lower and upper bounds for this
case are respectively given by



\\G-G 2q \\ C2{W > | (cos(^) + (cos(|^)) _1 

and 

||G-G 2 J £2[0>TM <f (cos(^) + (c s(^))~ ] 

:(i-^ + 0((2±i) 4 )). 



(44) 



eNT , 



(45) 



This system is rather pathological since any good approximant must have
$q \propto N$, as seen in equation (44). It is impossible for the error to
be made small unless $q$ is of the same order as $N$. That $q$ must scale
as $N$ implies that this system does not admit the same nice reductions as
the previous example. Recall that for the previous example the relative
error vanishes as $N \to \infty$ as long as $q \propto N^\delta$ for any
$\delta$ such that $0 < \delta < 1$. Thus, decent reductions must retain
far more states than in the previous example. Despite this pathology, the
appropriate reductions project out the slower modes. Since the same $U_d$
may be used as before (i.e. essentially a Fourier transform), the coarse
graining keeps the small-distance behavior (fast modes) and projects out
the rest. For this system, high-frequency modes are more easily amplified,
which explains the importance of including those modes in the approximate
response.
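The contrast between the two homogeneous chains can be sketched numerically.
The snippet below is only an illustration under stated assumptions: it takes
the small-discount HSV to be proportional to $\omega_p + \omega_p^{-1}$ (the
form that Appendix B suggests), assumes a fixed-end chain with unit masses,
and uses a hypothetical stiff spring constant (`k = 400.0`) chosen solely so
that every frequency exceeds 1. The function names are illustrative and not
taken from the text.

```python
import numpy as np

def chain_frequencies(N, k, m=1.0):
    """Normal-mode frequencies of N equal masses with fixed ends (assumed)."""
    p = np.arange(1, N + 1)
    return 2.0 * np.sqrt(k / m) * np.sin(p * np.pi / (2 * (N + 1)))

def hsv_proxy(omega):
    """omega + 1/omega: small-discount HSV up to the common factor 1/(4a)."""
    return omega + 1.0 / omega

def retained_modes(omega, q):
    """Indices (0-based, ordered by frequency) of the q largest-HSV modes."""
    return np.argsort(-hsv_proxy(omega), kind="stable")[:q]

# Case 1: k = 0.25 as in Figure 5, so every omega_p < 1.
slow_chain = chain_frequencies(N=49, k=0.25)
# Case 2: a hypothetical stiff chain, chosen only so that every omega_p > 1.
stiff_chain = chain_frequencies(N=49, k=400.0)

print(retained_modes(slow_chain, 5))   # lowest-frequency modes are kept
print(retained_modes(stiff_chain, 5))  # highest-frequency modes are kept
```

In the first case the retained states are the long-wavelength modes, while in
the second the ordering is reversed, in line with $\alpha(p) = N + 1 - p$.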
• The layered and random chains 

When $U_d$ is not a Fourier transform then these amenable dispersion
relations are not guaranteed. Without such relations, modal reduction is
not necessarily equivalent to local coarse graining. For example, consider
a system with uniform masses and spring constants that under a unitary
change of basis has the same $\Omega^2$ as in (40) but without the spatial
configuration of the linear chain (see Figure 6).





Figure 6: A spatially heterogeneous chain of linear oscillators. This is an 
example where spatially local coarse graining breaks down. 

This system is an oscillator system with nonlocal interactions and a
spatial Fourier transform will not diagonalize $\Omega$. For this system,
the appropriate reduction again would be a modal reduction since $B$ and
$C$ only differ from $I$ by a unitary transformation. The exceptional thing
about this example is that modal reduction lumps together oscillators that
are far apart yet directly connected to each other. Consequently, it is not
a local coarse graining. Although this system has the same characteristic
frequencies and HSV as the homogeneous linear chain, long-range
interactions disrupt the validity of local coarse graining. Despite the
artificial nature of this example, it illustrates the relationship between
heterogeneity, nonlocality, and long-range interactions. Frustration,
induced by competing interactions, also exemplifies these connections.
Figure 7: (a) A layered medium, (b) A layered chain of linear oscillators
(panel labels: spring constant = $k_1$, spring constant = $k_2$).

The scenario above is analogous to what happens in layered systems,
as depicted in Figure 7. The material is homogeneous along the vertical
direction while it is heterogeneous in the other (transverse) direction.
The importance of such examples cannot be overemphasized, for they imply
that the common practice of local coarse graining will not always apply to
heterogeneous systems$^{12}$. Figure 8(a) depicts the HSV for a
one-dimensional layered harmonic chain, a variant of the scenario depicted
in Figure 7. The spring constants are taken to be 0.125 on one side and
0.375 on the other. The HSV are distributed similarly to those of the
homogeneous chain, as seen by comparing Figures 5(a) and 8(a) or by
inspection of Figure 10(a). However, upon comparing Figures 5(b) and 8(b),
the frequencies are quite different. Additionally, the transformation that
diagonalizes $\Omega$ does not behave like a Fourier transform.
Consequently, the appropriate coarse graining for this system is nonlocal.



Figure 8: (a) A plot of the ordered HSV for the layered, harmonic oscillator
chain. The HSV are plotted for $T \propto N^2$ where $N = 49$. The spring
constant is taken to be 0.125 on one side and 0.375 on the other. (b) A plot
of the frequencies for the same system.

When the spring constants are sampled uniformly at random from the
interval [0.125, 0.375], somewhat surprisingly, the HSV are distributed
almost identically to those of the homogeneous chain. As in the case of the
layered chain, the frequencies of the random chain are different from those
of the homogeneous one. Just as in the previous case, the coarse grainings
for this system are nonlocal. In each case, when there is greater variation
in the values of $k$, the distribution of HSV starts to deviate more from
that of the purely homogeneous case.

12 Under reasonably restrictive assumptions about the nature of the
heterogeneities, the theory of homogenization allows one to effectively
locally coarse grain an inhomogeneous system.




Figure 9: (a) A plot of the ordered HSV for the harmonic oscillator chain with
uniformly sampled random springs. The spring constants are sampled from the
interval [0.125, 0.375]. The HSV are plotted for $T \propto N^2$ where
$N = 49$. (b) A plot of the frequencies for the same system.



Figure 10: (a) A comparative plot of the ordered HSV for the different
oscillator chains. The HSV are plotted for $T \propto N^2$ where $N = 49$.
(b) Plots of the frequencies for the different oscillator chains.
4 Conclusions and Future Directions

Attempting to approximate systems by only considering their relevant
details is not new in physics. For instance, resumming Feynman diagrams
corresponding to prevalent physical processes attempts to do this. Such
methods fail since they are not systematic. Being systematic is the novelty
of this work. Not only does it provide a systematic way of determining how to
coarse grain an arbitrary (linear) system, it also establishes how to im- 
plement the coarse graining in an algorithmic fashion. Furthermore, it is 
complementary to both the renormalization group (RG) PUJ |^ |32 and 
the projection-operator formalism because it removes much of the ambiguity
in coarse graining. Additionally, the Hankel methods can be used for
arbitrary linear systems, not simply for generalized Hamiltonian systems. 
However, for general systems, the gramians are not guaranteed to take such 
simple forms. 

This paper also answers an important question that was raised by Hartle
and Brun. In that work, both the quantum and classical aspects
of the harmonic chain (on a ring) were investigated. In particular, connec- 
tions were made between different coarse grainings and the determinacy of 
the coarsened equations of motion (in the classical case) and also between 
different coarse grainings and decoherence (in the quantum case). They 
conjectured that local coarse graining led to more deterministic equations 
of motion for the coarsened degrees of freedom than nonlocal coarse grain- 
ing, at least when the fast-mode initial conditions are thermally distributed. 
The induced noise for the locally-coarsened description was less than that 
for the nonlocal one. However, as was acknowledged in their work, the 
coarse grainings that were investigated constituted a set of measure zero of 
all the possible coarse grainings. This paper extends that work by consid- 
ering arbitrary coarse grainings and more general quadratic Hamiltonians 
than just the harmonic chain. For the homogeneous harmonic chain with
$\omega_p < 1$ for all $p$, we confirm that local coarse graining induces
less noise than
nonlocal coarse graining. Additionally, we establish that this conclusion
does not generalize to arbitrary oscillator systems. In fact, we have shown 
that this relationship is contingent on the dispersion relation and how the 
HSV depend on the normal-mode frequencies. Consequently, for a different 
oscillator system, it is possible that nonlocal coarse grainings will yield the 
most deterministic equations of motion. 

There are many theoretical directions in which this work can be taken;
however, the greatest pool of problems consists of those that relate to
physically-motivated models. The methods introduced here should be quite
useful when applied to any number of physical systems where local coarse
graining fails. For instance, inhomogeneous systems like layered or
disordered systems are prime nontrivial candidates. Additionally, this work
is ideally suited for nonequilibrium systems. In particular, since it
identifies the degrees of freedom that seem both most "excitable" and
"observable", it may be appropriate for revealing the true nature of
effective temperatures. For granular systems, this would be a big step
towards identifying the importance of such mysterious quantities as the
granular temperature and the free volume. Accordingly, it is in these
directions, among others, that future work using these methods should be
taken.
A Infinite time horizon results for finite time horizon problems

Approximating conservative linear systems over an infinite time horizon 
inevitably leads to divergences. This may be understood from the fact 
that the gramians become unbounded due to the infinite time horizon. A 
standard way to regulate this divergence is by approximating the system 
over a finite time horizon. Alternatively, the system can be exponentially 
discounted and considered over an infinite time horizon. 
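The divergence and its two regularizations can be illustrated on a single
oscillator. The following is a minimal sketch, assuming $B = I$ and a unit
frequency so that $e^{At}$ is a pure rotation; the helper names are
hypothetical. It estimates the finite-horizon gramian by quadrature and
obtains the discounted one from the Lyapunov equation for $-aI + A$.

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

def finite_horizon_gramian(A, T, steps=400):
    """Trapezoidal estimate of int_0^T e^{At} e^{A^T t} dt (B = I is assumed)."""
    ts = np.linspace(0.0, T, steps + 1)
    vals = np.array([expm(A * t) @ expm(A * t).T for t in ts])
    dt = T / steps
    return dt * (0.5 * vals[0] + vals[1:-1].sum(axis=0) + 0.5 * vals[-1])

def discounted_gramian(A, a):
    """W = int_0^inf e^{-2at} e^{At} e^{A^T t} dt, from (A - aI) W + W (A - aI)^T = -I."""
    n = A.shape[0]
    shifted = A - a * np.eye(n)
    return solve_continuous_lyapunov(shifted, -np.eye(n))

# Single oscillator with unit frequency: e^{At} is a rotation, so nothing decays.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
for T in (10.0, 100.0):
    print(T, np.trace(finite_horizon_gramian(A, T)))   # grows linearly with T
for a in (0.1, 0.01):
    print(a, np.trace(discounted_gramian(A, a)))       # finite, scales like 1/a
```

For this particular system the finite-horizon gramian is $T\,I$ and the
discounted gramian is $I/(2a)$, so both regularizations trade the divergence
for a single large parameter.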

In this appendix we express the upper and lower bounds for the approx- 
imation of the input-output operator over a finite time horizon in terms of 
exponentially-discounted infinite-time-horizon Hankel singular values. Al- 
though this analysis is only applied to conservative systems, we find the 
bounds for arbitrary finite dimensional systems that admit LTI, causal
realizations. Given a system realization $(A, B, C)$, we denote the
input-output operator and its order $r$ approximant respectively by $G$ and
$G_r$. Similarly, for the exponentially discounted system (i.e. with system
matrix $-aI + A$), the input-output operator and its approximant are
denoted by $G^{(a)}$ and $G_r^{(a)}$ respectively. Additionally, the finite
time horizon HSV are given by $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_n$
while the infinite time horizon singular values are given by
$\sigma_1(a) \ge \sigma_2(a) \ge \cdots \ge \sigma_n(a)$.

Equation (19) as it is stated is equally valid for infinite or finite time
horizons. However, we intend to relate $\|G - G_r\|_{\mathcal{L}_2[0,T],i}$
to the singular values $\{\sigma_i(a) : 1 \le i \le n\}$. The following new
theorem establishes the relation of interest.



Theorem A.0.1 (Lower Bound) Given an LTI, causal system $G$ with $n$
dimensional minimal realization $(A, B, C)$, if there exists an "a" such
that $-aI + A$ is a stable system matrix, then for any order $r$ (or less)
approximant $G_r$,

$$\|G - G_r\|_{\mathcal{L}_2[0,T],i} \ge \left(1 - e^{-2aT}\,\|e^{A^\dagger T} e^{AT}\|_{\mathbb{C}^n,i}\right)\sigma_{r+1}(a).$$

Proof : 

Since the Hankel operator is simply a projection of the original
input-output operator, we initially trivially find for an arbitrary $G_r$,

$$\|G - G_r\|_{\mathcal{L}_2[0,T],i} \ge \|\Gamma_G - \Gamma_{G_r}\|_{\mathcal{L}_2[0,T],i} \ge \|\Gamma_G - \mathcal{K}_r\|_{\mathcal{L}_2[0,T],i}, \tag{46}$$

where $\mathcal{K}_r$ is an arbitrary rank $r$ operator that is not
restricted to be of Hankel form. The last inequality arises because it is
known that $\|\Gamma_G - \Gamma_{G_r}\|_{\mathcal{L}_2[0,T],i} \ge \sigma_{r+1}(\Gamma_G)$
and equality is not always possible since $\Gamma_G$ is of Hankel form.
However, equality is achievable for an arbitrary operator $\mathcal{K}_r$.

The primary nontrivial step in this proof requires the observation that
each of the eigenvalues of the balanced gramian, $W^{(a)}$, associated to
$\Gamma_G^{(a)} = \Gamma^{(a)}$ is a decreasing function of $a$. This can be
seen by noting that for any vector $\zeta \in \mathbb{R}^n$ and for $b < a$,

$$\langle \zeta, (W^{(b)} - W^{(a)})\,\zeta \rangle \ge 0. \tag{47}$$

This means that $\sigma_k(a) \le \sigma_k(b)$ for $a > b$. From this
observation it then follows that

$$\|\Gamma_G - \mathcal{K}_r\|_{\mathcal{L}_2[0,T],i} \ge \|\Gamma^{(a)} - \mathcal{K}_r\|_{\mathcal{L}_2[0,T],i}. \tag{48}$$

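Then using that for any two bounded linear operators, $A$ and $B$, in the
induced norm $\|AB\| \le \|A\|\,\|B\|$ and that

$$\|e^{AT}\|_{\mathbb{C}^n,i} = \|e^{A^\dagger T} e^{AT}\|^{1/2}_{\mathbb{C}^n,i}, \tag{49}$$

we finally obtain

$$\begin{aligned}
\|\Gamma^{(a)} - \mathcal{K}_r\|_{\mathcal{L}_2[0,T],i} &\ge \|\Gamma^{(a)} - \mathcal{K}_r\|_{\mathcal{L}_2[0,\infty],i} - \|\Gamma^{(a)} - \mathcal{K}_r\|_{\mathcal{L}_2[T,\infty],i} \\
&\ge \left(1 - e^{-2aT}\,\|e^{A^\dagger T} e^{AT}\|_{\mathbb{C}^n,i}\right)\sigma_{r+1}(a).
\end{aligned} \tag{50}$$

Hence we arrive at the desired result,

$$\|G - G_r\|_{\mathcal{L}_2[0,T],i} \ge \left(1 - e^{-2aT}\,\|e^{A^\dagger T} e^{AT}\|_{\mathbb{C}^n,i}\right)\sigma_{r+1}(a). \tag{51}$$

In the case where $A$ has no Jordan blocks the above result simplifies to

$$\|G - G_r\|_{\mathcal{L}_2[0,T],i} \ge \left(1 - e^{2T \max \mathrm{Re}\,\lambda(-aI + A)}\right)\sigma_{r+1}(a). \tag{52}$$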

□ 
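The monotonicity used in the proof, $\sigma_k(a) \le \sigma_k(b)$ for
$a > b$, can be checked numerically. The sketch below is an illustration
rather than part of the argument: it assumes a hypothetical three-mass chain
with $B = C = I$, and the function name `discounted_hsv` is introduced here
only for the example.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def discounted_hsv(A, B, C, a):
    """HSV of the discounted realization (A - aI, B, C), largest first."""
    n = A.shape[0]
    shifted = A - a * np.eye(n)
    Wc = solve_continuous_lyapunov(shifted, -B @ B.T)      # controllability
    Wo = solve_continuous_lyapunov(shifted.T, -C.T @ C)    # observability
    eigs = np.sort(np.linalg.eigvals(Wc @ Wo).real)[::-1]
    return np.sqrt(np.clip(eigs, 0.0, None))

# Hypothetical test system: three unit masses, fixed ends, springs k = 0.25.
K = 0.25 * (2 * np.eye(3) - np.eye(3, k=1) - np.eye(3, k=-1))
A = np.block([[np.zeros((3, 3)), np.eye(3)], [-K, np.zeros((3, 3))]])
B = C = np.eye(6)

hsv_b = discounted_hsv(A, B, C, a=0.05)   # weaker discount, b
hsv_a = discounted_hsv(A, B, C, a=0.20)   # stronger discount, a > b
print(np.all(hsv_a <= hsv_b))
```

Each singular value shrinks as the discount increases, which is the property
that lets the discounted values enter the finite-horizon lower bound.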

This lower bound is completely consistent with (19) for stable systems.
Although we only use the upper bound in determining the relative error,
we include it in the following. Recall from equation (23) that the upper
bound is only valid for stable systems (for infinite time). In fact, it is almost 
exclusively proven for stable systems in the literature. The following upper 
bounds, valid for finite time horizon, are taken from [Q. 

Theorem A.0.2 (Upper Bound) Given an LTI, causal system $G$ with $n$
dimensional minimal realization $(A, B, C)$, if "a" is such that $-aI + A$
is a stable system matrix and $\|G^{(a)}\|_{\mathcal{L}_2,i} < \gamma$, then:

$$\|G\|_{\mathcal{L}_2[0,T],i} < \gamma\, e^{aT}.$$

Proof: This result follows from the differential version of the bounded real
lemma. For details refer to pp. 

□ 

We already know from equation (23) how to obtain an upper bound for the
approximation error, $\|G^{(a)} - G_r^{(a)}\|_{\mathcal{L}_2,i}$. We combine
this fact with Theorem A.0.2 to obtain the following corollary.

Corollary A.0.3 (Upper Bound) Given an LTI, causal system $G$ with $n$
dimensional minimal realization $(A, B, C)$, if "a" is such that $-aI + A$
is a stable system matrix then there exists an order $r$ input-output
operator, $G_r$, obtained by balanced truncation such that

$$\|G - G_r\|_{\mathcal{L}_2[0,T],i} \le 2 e^{aT} \sum_{j=1}^{k} \sigma_j^{\mathrm{dist}}(a),$$

where $\sigma_j^{\mathrm{dist}}(a)$, $1 \le j \le k$, are the distinct
infinite time horizon HSV from the set
$\{\sigma_{r+1}(a), \ldots, \sigma_n(a)\}$.

To be precise, an algorithm to obtain $G_r$ is as follows. First find the
realization for $G_r^{(a)}$ by truncating the balanced realization of
$(-aI + A, B, C)$. Denote the resulting realization by $(A_r, B_r, C_r)$.
A realization for $G_r$ is then just $(aI + A_r, B_r, C_r)$.
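The algorithm above can be sketched as follows. This is one possible
implementation, assuming real matrices, a minimal realization, and the
standard square-root balancing method (Cholesky factors of the gramians
followed by an SVD), which the text does not prescribe; the function name
and the two-mass test chain are hypothetical.

```python
import numpy as np
from scipy.linalg import cholesky, solve_continuous_lyapunov, svd

def discounted_balanced_truncation(A, B, C, a, r):
    """Order-r realization (aI + A_r, B_r, C_r) plus the discounted HSV."""
    n = A.shape[0]
    shifted = A - a * np.eye(n)                      # -aI + A, assumed stable
    Wc = solve_continuous_lyapunov(shifted, -B @ B.T)
    Wo = solve_continuous_lyapunov(shifted.T, -C.T @ C)
    Lc = cholesky(Wc, lower=True)                    # Wc = Lc Lc^T
    Lo = cholesky(Wo, lower=True)                    # Wo = Lo Lo^T
    U, hsv, Vt = svd(Lo.T @ Lc)                      # singular values = HSV
    scale = np.diag(hsv ** -0.5)
    T = Lc @ Vt.T @ scale                            # balancing transformation
    T_inv = scale @ U.T @ Lo.T                       # its inverse
    A_bal, B_bal, C_bal = T_inv @ shifted @ T, T_inv @ B, C @ T
    A_r, B_r, C_r = A_bal[:r, :r], B_bal[:r, :], C_bal[:, :r]
    return A_r + a * np.eye(r), B_r, C_r, hsv        # undo the discount shift

# Hypothetical example: two unit masses, fixed ends, springs k = 0.25.
K = 0.25 * np.array([[2.0, -1.0], [-1.0, 2.0]])
A = np.block([[np.zeros((2, 2)), np.eye(2)], [-K, np.zeros((2, 2))]])
B = C = np.eye(4)

A_red, B_red, C_red, hsv = discounted_balanced_truncation(A, B, C, a=0.1, r=2)
print(hsv)
```

Keeping all $n$ states (`r = n`) returns a realization that is merely a change
of basis of $(A, B, C)$, which is a convenient sanity check before truncating.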
B Calculation of the Gramians



In this appendix we intend to calculate the balanced form of the gramians
for oscillator systems (i.e. of systems of the form found in equation (25)).
This entails calculating the damped (exponentially discounted) gramians:

$$W_c^{(a)} = \int_0^\infty e^{-2at}\, e^{At} e^{A^\dagger t}\, dt, \qquad
W_o^{(a)} = \int_0^\infty e^{-2at}\, e^{A^\dagger t} e^{At}\, dt. \tag{53}$$



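However, first let us introduce the following notations and conventions.
Recall that any matrix, $S$, may be expressed in terms of the canonical
matrix units, $e_{ij}$. In other words,

$$S = \sum_{i,j} s_{ij}\, e_{ij},$$

where each $s_{ij}$ is just a complex number. For instance, in the case of
$2 \times 2$ matrices,

$$e_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

Additionally, for this section,
$Q = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. Lastly, we frequently
use the algebraic tensor product, $\otimes$ $^{13}$. For instance, suppose

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$

then

$$A \otimes B = \begin{pmatrix} A_{11} B & A_{12} B \\ A_{21} B & A_{22} B \end{pmatrix}.$$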

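First note that if we define

$$R = e_{11} \otimes \Omega^{-1/2} + e_{22} \otimes \Omega^{1/2} \tag{54}$$

then easily it follows that

$$A = e_{12} \otimes I - e_{21} \otimes \Omega^2 = \begin{pmatrix} 0 & I \\ -\Omega^2 & 0 \end{pmatrix}
\;\xrightarrow{\;R\;}\; R^{-1} A R = Q \otimes \Omega = \begin{pmatrix} 0 & \Omega \\ -\Omega & 0 \end{pmatrix}. \tag{55}$$

From this, one then finds that

$$\begin{aligned}
W_c^{(a)} &= R \int_0^\infty e^{-2at}\, e^{Q \otimes \Omega t}\, R^{-1} R^{-1}\, e^{-Q \otimes \Omega t}\, dt\, R \\
&= R \int_0^\infty e^{-2at}\, e^{Q \otimes \Omega t} \begin{pmatrix} \Omega & 0 \\ 0 & \Omega^{-1} \end{pmatrix} e^{-Q \otimes \Omega t}\, dt\, R.
\end{aligned} \tag{56}$$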



13 Often referred to as the dyadic product. 
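However, using that
$e^{Q \otimes \Omega t} = I \otimes \cos \Omega t + Q \otimes \sin \Omega t$
we finally arrive at

$$\begin{aligned}
W_c^{(a)} &= R \int_0^\infty e^{-2at}
\begin{pmatrix}
\Omega \cos^2 \Omega t + \Omega^{-1} \sin^2 \Omega t & \tfrac{1}{2}(\Omega^{-1} - \Omega) \sin 2\Omega t \\
\tfrac{1}{2}(\Omega^{-1} - \Omega) \sin 2\Omega t & \Omega^{-1} \cos^2 \Omega t + \Omega \sin^2 \Omega t
\end{pmatrix} dt\, R \\
&= \frac{1}{4a} R \begin{pmatrix} \Omega + \Omega^{-1} & 0 \\ 0 & \Omega + \Omega^{-1} \end{pmatrix} R \\
&\quad + \frac{1}{4a} R \begin{pmatrix}
-a^2 \Omega^{-1} (I - \Omega^2)(a^2 I + \Omega^2)^{-1} & a (I - \Omega^2)(a^2 I + \Omega^2)^{-1} \\
a (I - \Omega^2)(a^2 I + \Omega^2)^{-1} & a^2 \Omega^{-1} (I - \Omega^2)(a^2 I + \Omega^2)^{-1}
\end{pmatrix} R.
\end{aligned} \tag{57}$$

Similarly for the observability gramian we obtain

$$\begin{aligned}
W_o^{(a)} &= \frac{1}{4a} R^{-1} \begin{pmatrix} \Omega + \Omega^{-1} & 0 \\ 0 & \Omega + \Omega^{-1} \end{pmatrix} R^{-1} \\
&\quad + \frac{1}{4a} R^{-1} \begin{pmatrix}
a^2 \Omega^{-1} (I - \Omega^2)(a^2 I + \Omega^2)^{-1} & a (I - \Omega^2)(a^2 I + \Omega^2)^{-1} \\
a (I - \Omega^2)(a^2 I + \Omega^2)^{-1} & -a^2 \Omega^{-1} (I - \Omega^2)(a^2 I + \Omega^2)^{-1}
\end{pmatrix} R^{-1}.
\end{aligned} \tag{58}$$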
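The closed forms for the discounted gramians can be cross-checked against a
direct solution of the corresponding Lyapunov equations. The sketch below is
only a numerical sanity check under stated assumptions: $\Omega$ is taken to
be diagonal, $B = C = I$, and the frequencies and discount are hypothetical
test values; the function names are not from the text.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def closed_form_gramians(omega, a):
    """Closed-form (Wc, Wo) for diagonal Omega = diag(omega) and discount a."""
    w = np.asarray(omega, dtype=float)
    Z = np.zeros((w.size, w.size))
    R = np.block([[np.diag(w ** -0.5), Z], [Z, np.diag(w ** 0.5)]])
    R_inv = np.block([[np.diag(w ** 0.5), Z], [Z, np.diag(w ** -0.5)]])
    main = np.diag(w + 1.0 / w)                                # Omega + Omega^{-1}
    off = np.diag(a * (1 - w**2) / (a**2 + w**2))              # a (I - O^2)(a^2 I + O^2)^{-1}
    corr = np.diag(a**2 * (1 - w**2) / (w * (a**2 + w**2)))    # a^2 O^{-1}(I - O^2)(a^2 I + O^2)^{-1}
    Xc = np.block([[main - corr, off], [off, main + corr]])
    Xo = np.block([[main + corr, off], [off, main - corr]])
    return R @ Xc @ R / (4 * a), R_inv @ Xo @ R_inv / (4 * a)

def lyapunov_gramians(omega, a):
    """Same gramians from the Lyapunov equations for -aI + A (B = C = I)."""
    w = np.asarray(omega, dtype=float)
    n = w.size
    A = np.block([[np.zeros((n, n)), np.eye(n)], [-np.diag(w**2), np.zeros((n, n))]])
    shifted = A - a * np.eye(2 * n)
    Wc = solve_continuous_lyapunov(shifted, -np.eye(2 * n))
    Wo = solve_continuous_lyapunov(shifted.T, -np.eye(2 * n))
    return Wc, Wo

omega, a = [0.5, 0.9, 1.7], 0.05     # hypothetical frequencies and discount
Wc_cf, Wo_cf = closed_form_gramians(omega, a)
Wc_ly, Wo_ly = lyapunov_gramians(omega, a)
print(np.allclose(Wc_cf, Wc_ly), np.allclose(Wo_cf, Wo_ly))
```

Agreement of the two routes supports the signs of the $O(a)$ and $O(a^2)$
blocks, which are the terms that matter for the perturbation analysis of
Appendix C.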

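From equations (57) and (58) it follows, after using $U_d$ to diagonalize
$\Omega$ and taking the small "a" limit, that the balanced gramian, without
ordered eigenvalues, is given by

$$W^{(a)} = \frac{1}{4a} \begin{pmatrix}
\Lambda_\Omega + \Lambda_\Omega^{-1} + O(a^2) & O(a) \\
O(a) & \Lambda_\Omega + \Lambda_\Omega^{-1} + O(a^2)
\end{pmatrix}.$$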



C Restrictions on the time horizon/exponential 
discounting 

C.1 Relevant matrix perturbation

We start with a matrix $L_\epsilon$ of the form,

$$L_\epsilon = \begin{pmatrix} M & \epsilon N \\ \epsilon N & M \end{pmatrix}. \tag{59}$$

The question we intend to address is, how do the $\epsilon$ terms perturb
the spectrum of $L_0$?



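$$\begin{aligned}
\det(\lambda I - L_\epsilon) &= \det \begin{pmatrix} \lambda I - M & -\epsilon N \\ -\epsilon N & \lambda I - M \end{pmatrix}
= \det \begin{pmatrix} \lambda I - M & -\epsilon N \\ 0 & \lambda I - M - \epsilon^2 N (\lambda I - M)^{-1} N \end{pmatrix} \\
&= \det(\lambda I - M)\, \det\!\left(\lambda I - M - \epsilon^2 N (\lambda I - M)^{-1} N\right) \\
&= \left(\det(\lambda I - M)\right)^2 \det\!\left(I - \epsilon^2 \left((\lambda I - M)^{-1} N\right)^2\right).
\end{aligned} \tag{60}$$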
However, recall that given an invertible matrix $P$,

$$P^{-1} = (\det(P))^{-1}\, \mathrm{adj}(P), \tag{61}$$

where adj is the formal matrix adjoint. Combining (61) with the fact that
$\epsilon \ll 1$, we obtain:

$$\det(\lambda I - L_\epsilon) \approx \left(\det(\lambda I - M)\right)^2 - \epsilon^2\, \mathrm{Tr}\!\left((\mathrm{adj}(\lambda I - M) N)^2\right). \tag{62}$$
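The determinant factorization and its small-$\epsilon$ approximation can be
verified numerically away from the spectrum of $M$. The following sketch uses
hypothetical test matrices and a value of $\lambda$ that is deliberately not
an eigenvalue of $M$, so that $\lambda I - M$ is invertible; the function
names are illustrative only.

```python
import numpy as np

def block_matrix(M, N, eps):
    """L_eps = [[M, eps N], [eps N, M]]."""
    return np.block([[M, eps * N], [eps * N, M]])

def det_direct(M, N, eps, lam):
    """det(lam I - L_eps), evaluated on the full 2n x 2n matrix."""
    n = M.shape[0]
    return np.linalg.det(lam * np.eye(2 * n) - block_matrix(M, N, eps))

def det_factored(M, N, eps, lam):
    """det(P)^2 det(I - eps^2 (P^{-1} N)^2) with P = lam I - M."""
    n = M.shape[0]
    P = lam * np.eye(n) - M
    X = np.linalg.solve(P, N)
    return np.linalg.det(P) ** 2 * np.linalg.det(np.eye(n) - eps**2 * X @ X)

def det_small_eps(M, N, eps, lam):
    """det(P)^2 - eps^2 Tr((adj(P) N)^2), using adj(P) = det(P) P^{-1}."""
    n = M.shape[0]
    P = lam * np.eye(n) - M
    Y = np.linalg.det(P) * np.linalg.solve(P, N)   # adj(P) N
    return np.linalg.det(P) ** 2 - eps**2 * np.trace(Y @ Y)

# Hypothetical test data; lam is deliberately not an eigenvalue of M.
M = np.diag([1.0, 2.0, 3.0])
N = np.array([[0.5, 0.2, 0.1], [0.2, -0.3, 0.4], [0.1, 0.4, 0.8]])
eps, lam = 1e-3, 0.3

print(det_direct(M, N, eps, lam), det_factored(M, N, eps, lam),
      det_small_eps(M, N, eps, lam))
```

The first two values agree to rounding error, while the third differs only at
order $\epsilon^4$, which is the size of the terms dropped in the
approximation.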

Now, let us evaluate the above at an eigenvalue of $L_\epsilon$ that is
"near" an eigenvalue of $L_0$. Thus, $\lambda = \lambda_0 + \lambda(\epsilon)$,
where $\lambda_0$ is an eigenvalue of $L_0$ and $\lambda(\epsilon) \to 0$ as
$\epsilon \to 0$. First consider the transformation $U$ that has the
property that

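where $r \ge 1$. From this we find

$$\begin{aligned}
\det(\lambda_0 I - M + \lambda(\epsilon) I) &= \lambda^r(\epsilon) \det(\lambda_0 I - M) \det\!\left(I + \lambda(\epsilon)(\lambda_0 I - M)^{-1}\right) \\
&= \lambda^r(\epsilon) \det(\lambda_0 I - M) \left[1 + \lambda(\epsilon) (\det(\lambda_0 I - M))^{-1}\, \mathrm{Tr}(\mathrm{adj}(\lambda_0 I - M))\right] + O(\lambda^{r+2}(\epsilon)) \\
&= \lambda^r(\epsilon) \det(\lambda_0 I - M) + \lambda^{r+1}(\epsilon)\, \mathrm{Tr}(\mathrm{adj}(\lambda_0 I - M)) + O(\lambda^{r+2}(\epsilon)).
\end{aligned} \tag{64}$$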

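Combining (62) and (64) results in

$$\det(\lambda I - L_\epsilon) = \lambda^{2r}(\epsilon) \left(\det(\lambda_0 I - M)\right)^2
- \epsilon^2\, \mathrm{Tr}\!\left((\mathrm{adj}(\lambda_0 I - M) N)^2\right)
+ O(\max\{\lambda^{2r+1}(\epsilon), \epsilon^2 \lambda(\epsilon)\}) = 0. \tag{65}$$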

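This naively suggests that $|\lambda(\epsilon)| \sim \epsilon^{1/r}$.
However, unless $M$ has Jordan blocks,
$\mathrm{adj}(\lambda I - M)|_{\lambda = \lambda_0} = 0$. We will consider
$M$ to be diagonalizable, in order to discern the behavior of
$\lambda(\epsilon)$. Thus, in the basis where $M \to \Lambda_M$ (i.e. the
basis where $M$ is diagonal), the following holds:

$$\frac{d^k}{d\lambda^k}\, \mathrm{adj}(\lambda I - \Lambda_M)\Big|_{\lambda = \lambda_0} = 0 \quad \text{for all } k < r - 1. \tag{66}$$

In this case, we find

$$\begin{aligned}
\mathrm{adj}(\lambda I - \Lambda_M)\Big|_{\lambda = \lambda_0 + \lambda(\epsilon)}
&= \frac{\lambda^{r-1}(\epsilon)}{(r-1)!} \frac{d^{r-1}}{d\lambda^{r-1}}\, \mathrm{adj}(\lambda I - \Lambda_M)\Big|_{\lambda = \lambda_0} + O(\lambda^r(\epsilon)) \\
&= \lambda^{r-1}(\epsilon) \det(\lambda_0 I - M) \begin{pmatrix} 0 & 0 \\ 0 & I \end{pmatrix} + O(\lambda^r(\epsilon)).
\end{aligned} \tag{67}$$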
Combining (65) and (67) and provided that there exists a transformation
$V$ such that $V^{-1} M V = \Lambda_M$ and $V^{-1} N V = \tilde{N}$ we
obtain the result



det(AI - L e ) = A 2r (e)(det(A I - M)) 2 - e 2 A 2r - 2 (e)Tr(N 22 ) 
+0(max{A 2 ''+ 1 (e),e 2 A 2 '- 1 (e)}) = 0, 
where N22 is an r x r matrix that comes from 



(68) 



Hence we finally arrive at 

l A ( £ )l = (^rTyrVl^^l+o^ 2 ). (70) 

C.2 Consequences: Results put in context 

For $r = 1$ and $N$ Hermitian, (70) directly implies that the $i$th
eigenvalue of $M$ is perturbed as

$$|\lambda_i(\epsilon)| = \epsilon\, |N_{ii}| + O(\epsilon^2) \le \epsilon\, \|N\|_\infty + O(\epsilon^2). \tag{71}$$

This provides an upper bound on how the perturbation shifts the spectrum 
of the unperturbed matrix. 
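The first-order shift and its upper bound can be illustrated numerically. The
sketch below is a check on hypothetical data, assuming a diagonal $M$ with
well-separated entries (so $r = 1$) and a real symmetric $N$; the function
name is introduced only for this example.

```python
import numpy as np

def spectrum_shifts(M_diag, N, eps):
    """Sorted spectrum of L_eps minus sorted spectrum of L_0."""
    M = np.diag(M_diag)
    zero = np.zeros_like(N)
    L0 = np.block([[M, zero], [zero, M]])
    Le = np.block([[M, eps * N], [eps * N, M]])
    return np.linalg.eigvalsh(Le) - np.linalg.eigvalsh(L0)

# Hypothetical data: distinct, ascending eigenvalues of M and symmetric N.
M_diag = np.array([1.0, 2.0, 3.0])
N = np.array([[0.5, 0.2, 0.1], [0.2, -0.3, 0.4], [0.1, 0.4, 0.8]])
eps = 1e-3

shifts = spectrum_shifts(M_diag, N, eps)
first_order = eps * np.repeat(np.abs(np.diag(N)), 2)   # eps |N_ii|, each twice
bound = eps * np.linalg.norm(N, np.inf)                # eps ||N||_inf

print(np.abs(shifts))
print(first_order, bound)
```

Each eigenvalue of $M$ appears twice in $L_0$ and splits into a pair shifted
by approximately $\pm \epsilon N_{ii}$, and every shift stays below
$\epsilon \|N\|_\infty$.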

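From Appendix B we can make the following associations: $\epsilon = a$,
$M = \Lambda_\Omega + \Lambda_\Omega^{-1}$, and
$N = \Lambda_\Omega^{-2} - I$. Our objective is to roughly determine the
size of "a" such that the ordering of the Hankel singular values is
preserved. As in Section 3.1, given that
$\lambda_j(\Omega) < \lambda_k(\Omega)$ for all $j < k$, let $\alpha$ be a
permutation such that
$\lambda_{\alpha(j)}(\Omega) + \lambda_{\alpha(j)}^{-1}(\Omega) \le \lambda_{\alpha(k)}(\Omega) + \lambda_{\alpha(k)}^{-1}(\Omega)$
for all $k < j$. Though it is somewhat conservative, the ordering about the
$i$th unperturbed Hankel singular value is guaranteed provided that

$$\begin{aligned}
a \left( |\lambda_{\alpha(i)}^{-2} - 1| + |\lambda_{\alpha(i+1)}^{-2} - 1| \right) &\le \lambda_{\alpha(i)} + \lambda_{\alpha(i)}^{-1} - \lambda_{\alpha(i+1)} - \lambda_{\alpha(i+1)}^{-1}, \\
a \left( |\lambda_{\alpha(i-1)}^{-2} - 1| + |\lambda_{\alpha(i)}^{-2} - 1| \right) &\le \lambda_{\alpha(i-1)} + \lambda_{\alpha(i-1)}^{-1} - \lambda_{\alpha(i)} - \lambda_{\alpha(i)}^{-1}.
\end{aligned} \tag{72}$$

It follows that the ordering of the HSV is guaranteed to be preserved
provided that

$$a < \min_i \frac{\lambda_{\alpha(i)} + \lambda_{\alpha(i)}^{-1} - \lambda_{\alpha(i+1)} - \lambda_{\alpha(i+1)}^{-1}}{|\lambda_{\alpha(i)}^{-2} - 1| + |\lambda_{\alpha(i+1)}^{-2} - 1|}. \tag{73}$$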